Tuesday, September 30, 2008

Don’t Ignore Naming Convention

Recently I worked with an organization having hundreds of DB2 databases and still a uniform and consistent naming convention across all these databases. Naming convention started from Server Name, Instance Name and Database Name itself. Just by looking at an Instance Name, you can tell whether it’s a production, development or QA instance and the associated application for that instance. Instance home directory, database path, log path, archive log path, database backup path ..., all followed a consistent naming convention.

In the above mentioned environment, if a file system is getting full on any of the DB2 database server, you immediately know if that file system is database related and which instance/database is using that file system. If a new DBA joins the team, he can quickly get familiar with the large environment just by learning the conventions. If a database has to be backed up, he knows the backup target and if a database has to be recovered, he knows where to find the backup image and archive logs, all because of consistent naming convention.

A good naming convention not only helps a new DBA quickly come up to speed, but it also simplifies automation of DBA tasks. Many of the decision points in automation can get benefitted by the established convention. Automation platform like Data Palette has the capability to capture all the variations and weirdness in your database environment and still enable the decision making. However, more variations often lead to complexity in implementing and managing the automation.

Just having any naming convention is not enough, you should have a good naming convention. I have described below some of the qualities of a good naming convention

1. Make your naming convention informative
You can name your instances like db2ins01, db2ins02 and so on. But such naming convention does not have much information except that it is a DB2 instance. Instead if your instance name can tell you whether it’s production, development or QA instance, then you get more information out of that naming convention. Also, if your instance name includes INST or your database name includes DB, it does not provide you any additional info if you already know that you are looking at an instance name or database name.

Similarly the directory structure of your instance, database and table space level paths should incorporate the corresponding instance name, database partition number, database name hierarchy. By looking at instance/database configuration and system catalog, you can figure out most of the associated paths. However, the reverse mapping i.e. you know the path and trying to figure out who is using that path is straightforward only if you have an appropriate naming convention.

Whenever a name is derived from some other base names, do not abbreviate or shorten the base names while using into derived names. For example, if your table space container path includes a directory with your database name, this directory name should have a complete database name and not abbreviated or short form. If you use the abbreviated or short name, your path name becomes less informative because now you need additional data (the mapping between full name and short name) to determine if that directory name corresponds to the database name.

2. Use fixed-width numbers in your naming convention
If your naming convention includes a sequence number, have a fixed width sequence number. For example, if you know that you will need max four digits for a sequence number in your naming convention, use 0001 instead of just 1. Also, it’s better to have your database partition numbers (in db2nodes.cfg) as fixed width. For example, if you have 16 partitions in your database from 0 to 15, have your partitions named as 00, 01, 02 and so on. Such partition numbers has a distinct advantage while using “##” in your db2_all command. The character sequence “##” (two hash characters) can be used to get replaced by partition number while running a command through db2_all. So if you use NODE00## in your db2_all command, it can get replaced as NODE0000, NODE0001...NODE0015 in the above example of a 16 partition database.

No matter how good a naming convention is, if it’s not followed consistently, it’s not useful. I started this post with an ideal example of consistency in naming convention. On the other extreme, I have also come across organizations where DBA’s come up with a new naming convention whenever they set up a new database. To make it even worse, within a database also the naming convention would start changing over a period of time...it all depends on the mood of DBA while coming up with the name.

In my opinion, naming conventions should be properly defined as a planned effort, it should be well documented and strictly enforced by change control board or a similar regulation authority. Though, I fully agree that there are several scenarios where we can’t expect to have a uniform naming convention across a larger organization. For example, let’s imagine a company being acquired by another company and as a result the IT department of these companies getting merged into a single organization. However, naming conventions should not get destroyed just because of ignorance or negligence of individuals in an organization.


Anonymous said...

Your blog keeps getting better and better! Your older articles are not as good as newer ones you have a lot more creativity and originality now keep it up!

Anonymous said...

Just want to say your article is as astounding. The clarity in your post is just spectacular and i could assume you are an expert on this subject. Well with your permission let me to grab your RSS feed to keep updated with forthcoming post. Thanks a million and please keep up the rewarding work.