Saturday, March 20, 2010

Where a DB Request Spends Time in DB2 9.7?

This title may sound somewhat familiar if you have been following my Blog for over a year. More than a year ago, I wrote about the shortcomings of time spent monitoring in DB2 (see my post “Where a DB Request Spends Time”). Last year when DB2 9.7 was released (around June 2009), I was thrilled to see “Time-spent monitor elements are more comprehensive” under monitoring enhancements in DB2 9.7. Since then I have been thinking about installing DB2 9.7 on my laptop and having a closer look at this enhancement. Finally, I got chance to do so and here I am going to share my findings.

Without spending too much time trying to come up with a more complex workload, I created a table t3 with similar structure as syscat.columns and populated it with rows from syscat.columns. Then I ran a query against this table as shown below. The real elapsed time of this SQL was more than 38 minutes (long enough for my analysis purpose).

Listing 1


Before I start analyzing the time spent by above SQL, let me list some of the queries that I used for this analysis.

Following query gives the client idle wait time, total request time, total wait time and various processing times for each connection in the database.

Listing 2


If you are interested to drill down the total wait time of each connection, following query can be used.

Listing 3


A connection can have more than one Unit of Work (UOW) and if you are interested to monitor the time spent on UOW level, following query can be used.

Listing 4


Again, if you want to drill down the total wait time of each UOW, following query can be used.

Listing 5


If you are interested to monitor time spent on statement level, you can use mon_get_pkg_cache_stmt table function. Following query gives info on time spent by top 10 dynamic SQL statements (sorted by statement execution time).

Listing 6


If you want to drill down the wait time of dynamic SQL statements returned by above query, following query can be used.

Listing 7


In all the above monitoring queries (Listing 2 through 7), I have intentionally left the WHERE clause as you can use the appropriate WHERE clause as per your specific need. Also, the result set of monitoring table functions used in above queries can be altered through the input parameters passed to those table functions. Check the documentation of these table functions for more detail.

In each of the above monitoring queries, SELECT list has indentation and the indented columns give the individual components of the time spent monitoring element selected just before the indented columns. “Hierarchy of time-spent monitor elements” on DB2 Information Center was very helpful while writing above queries. Also, you might have noticed the most of the wait time components are suffixed with “_wait_time” and similarly most of the processing time components are suffixed with “_proc_time”. Any remaining ambiguities get clarified by looking at the documented hierarchy of time-spent monitor elements.

Though, I found the time-spent monitoring elements available at each level (Connection, UOW, Statement) comprehensive, here I am going to present only the UOW level time spent monitoring for the SQL execution in Listing 1.

Here is the output of db2_mon_uow.sql at the end of SQL execution in Listing 1. To make it more readable, I have presented it vertically.

Listing 8


In the above output and in fact in all the columns selected in Listing 2 through 7, the unit of time is millisecond. As you can see total request time (~2242 seconds) is very close to sum of various time components (~2237 seconds) that constitutes the total request time. Even a minor difference (5 seconds or 0.2 %) can be clearly explained by “Other” category as documented under “Hierarchy of time-spent monitor elements” on DB2 Information Center. Though, the real elapsed time as shown in the output of “time” command (in Listing 1) is higher than the total request time reported by DB2, it can be due to the difference in time recording boundaries used by “time” command versus what’s used by DB2.

Similarly, if you want to drill down the total wait time, here is the output from db2_mon_uow_wait.sql

Listing 9


The difference between total wait time and sum of individual wait time components is just 65 milliseconds.

With this time spent monitoring enhancement, now we can easily find out where a DB request is spending maximum time and accordingly address any problem areas in database if present. Kudos to IBM DB2 team for putting this enhancement in DB2 9.7.

Tuesday, February 02, 2010

DBA's Strengths

Did you ever get so engrossed doing something that hours passed by and you did not even realize? It might have happened while watching an interesting movie or playing your favorite video game or similar fun stuff. But did it ever happen while doing your DBA job, the job which pays you salary? I am sure it must have happened unless you are in a completely wrong profession. However, how often does it happen? Do you enjoy doing everything that you have been doing as a DBA? Most likely your answers would be “rarely” and “no”. Imagine how productive you would be if we flip this situation such that most of your job related activities fall under the category which you enjoy doing? Such activities are basically your strengths and you have a natural inclination towards such activities.

Marcus Buckingham, a speaker, trainer, researcher and author has been involved in finding solutions to exactly this type of workplace issues. I got chance to read some of his books (First Break All the Rules, Now Discover Your Strengths, Go Put Your Strengths to Work) and a core finding in his books can be summarized as:

We should try to identify our strengths and focus on our strengths to further grow it rather than spend all our time and effort on our weaknesses. Also, the best managers work hard to understand what their employee’s true talents/strengths are and then shape the job to allow the employee to perform to their maximum. It doesn't pay to focus on people's weaknesses; focus on their strengths.

In his book “Go Put Your Strengths to Work”, he lists following four signs that can help you identify your strengths:
  • When you do it, you feel effective
  • Before you do it, you actively look forward to it
  • While you are doing it, you feel inquisitive and focused
  • After you’ve done it, you feel fulfilled and authentic

If you are interested to learn more about his works, I would recommend reading his books. I would not be able to create the same impact in this blog even if I try to. However, here I would like to analyze his suggestions in context of DBA job and DBA team.

Traditionally, DBA roles have been classified as Application DBA, System DBA, Development DBA, Production DBA etc. These are very broad classifications and DBAs under each category still end up doing many of the activities which are not their strengths and they don’t enjoy doing it. The challenge is how to setup a DBA team where each DBA enjoys doing most of the activities that he is doing. First I will approach this problem from a DBA Manager’s perspective and then from an individual DBA’s perspective.

First of all, each DBA in the team should try to find out if they even like doing DBA job or they should pursue a different profession. After this high level elimination round, the team should be left of only those DBAs who really like being a DBA. Next we have to focus on how to distribute the responsibilities in such a way that most of the DBAs enjoy doing most of their assigned activities. For that, we have to first list down the activities on a more granular level that can really be associated with the likes (strengths) and dislikes (weaknesses) of a DBA. For example, here I have listed some of the DBA activities. They can be made more granular if required. Many more such activities can be added to the list.
  • Analyzing new application requirements and associated database changes
  • Converting a new database logical design to physical design
  • Creating new database
  • Storage capacity planning
  • SQL tuning
  • Configuration parameter tuning
  • Creating database standards and naming conventions
  • Performing database upgrades
  • Physical design of DDL changes
  • Implementing DDL changes
  • Performance diagnosis
  • Setting up performance monitoring
  • Troubleshooting database errors
  • Automating repetitive and mundane tasks
  • Exploring and implementing DBMS new features
  • Managing database scheduled jobs
  • Reacting to database user complains
  • Performing database security review

After listing the activities on such a granular level, each DBA should try to identify top 4 to 6 activities that really show the signs of being their strengths. The activities which show the signs opposite to that of strengths are the weaknesses and each DBA should also identify top 4 to 6 activities that are their weaknesses. In fact more activities will get added to the list while DBAs will be trying to identify their strengths and weaknesses. Once all the DBAs have identified their strengths and weaknesses, the responsibilities should be assigned in such a way that each DBA should be doing more of their strengths and less of their weaknesses.

A common concern about this approach is, if every DBA will be doing the interesting stuff, who will do the boring stuff. In my experience, interesting and boring words are very subjective. Something that is interesting to me does not have to be interesting for others also and vice versa. One can argue, who will find it interesting to resolve an alert in the middle of night. However, we should try to separate the actual activity with the circumstances under which we are doing it. We have to find out if the DBA hates resolving the alert or just because it’s in the middle of night. I agree there are few mundane things that most DBAs will hate doing. Wherever possible such mundane activities should be automated. This way, analyzing each activity carefully and assigning the responsibilities considering strengths and weaknesses of individual DBAs will result in a team where each DBA will do more of what they enjoy doing and will enhance the team productivity.

In reality, going through this whole exercise will not be as simple as it sounds here. For an individual DBA, just identifying his strengths and weaknesses will require lot of introspection over a period of several weeks. Some caution is required while identifying the strengths and weaknesses. Normally, we attach a hypothetical superiority or inferiority with each activity and we desire to do only the superior activities. We think that we will enjoy doing those activities and they are our strengths. However, our strengths are those which pass the litmus test of above mentioned 4 signs and not the activities which we think superior and desire to do it. Strengths should also not be confused with Skills. There are many activities that we are very skilled at because we have done it several times, however they might not necessarily reflect the 4 signs of strengths. This strength based approach of assigning roles and responsibilities within DBA team will require more introspection from each individual DBA as well as more managerial effort from the manager, however it’s worth doing.

If you are a DBA and your manager has no interest in strength based assignment of roles and responsibilities, you can still apply these techniques at least for you (or maybe along with some of your colleagues). You might be thinking that you do what your manager assigns to you. However, consciously or sub- consciously you also play a major role in deciding what activities you do more and what you do less. After identifying your strengths, you can try to volunteer more in the areas of your strengths and wherever possible reduce your involvement in the areas of your weaknesses.

Wednesday, January 13, 2010

Multi-terabyte

After a long gap, I am back again to share some of my thoughts. Leaving my Blog dry for so long makes me feel drained as well. However, I try to console myself thinking now it’s the age of information overload. Also, the growth of community websites in past few years is pouring so much information that it’s difficult for one to cope up with it. Hence, no harm slowing down my contribution to this information overload :)

Even though I was not posting anything on my Blog, I was not completely detached from it. In the meanwhile, I was tackling with the junk comments on my posts :) Yes, you read it correctly... junk comments, not junk emails. Since the email applications already have ways to filter out most of the junk emails, the junkies are targeting blogs now.

One day while looking at my blog, my eyes got stuck at a keyword used in my profile and triggered a chain of thoughts... this word was “multi-terabyte”. I recalled when I had written that profile, I did not mean more than 10 terabytes by it. Just 5 years ago from now, a 10 terabyte data warehouse was considered to be huge and working with such databases used to make me feel privileged. However, now a days I doubt if anyone will consider a smaller than 100 terabyte data warehouse huge. In fact, we have already started talking about petabyte size of data warehouses.

That naturally makes me think, will DB2 be able to support muti-petabyte warehouses in future with its current architecture. If not, what’s there in IBM’s roadmap to support such warehouses? Well, I have not heard much except the empty promises that current scalability features (data partitions, balanced configuration units etc.) should be able to tackle it. Beyond DB2 world, I do see some promising work and an important one to note would be Column-oriented DBMS.

Column-oriented DBMS is not a new concept, though recently it has started getting more traction. If you are hearing about it first time, you can get a quick overview of it on the Wikipedia. Also, you will find tons of articles regarding it if you google it. IBM has been downplaying column-oriented DBMS mentioning that it’s not suitable for all type of workloads in data warehousing environment, the companies who provide Column-oriented DBMS still have a very small footprint in the data warehousing market and so on. These arguments are basically deviating the discussion from the main point.

The important question is, will the majority of workload in data warehousing environment greatly benefit by Column-oriented DBMS approach or not. And the researches on this topic have answered this question affirmatively. Also, I have not come across any clear statement from IBM or IBMers disproving this fact. Well, I am not proposing that IBM should change DB2 from row-oriented to column-oriented or come up with a brand new column-oriented DBMS. However, it can provide an option for column-oriented tables in DB2. May be it will make more sense to have this option on table space level and all the tables under that table space will be stored column-oriented. Please don’t take me wrong, I don’t think this feature is going to be so easy to implement. I can anticipate its far reaching impact on DB2 core engine and optimizer, but so what? When IBM came up with hybrid XML/relational DB2 engine for native support of XML data, it was not a simple feature. If I recall correctly, IBM publicized it to involve 5 years of development time by a large development team.

The next important question is, is it worth investing so much effort in column-oriented table space feature. I think it is... because it will prepare DB2 for the next generation data warehouses. DB2 has been stronger in the data warehousing market compared to OLTP and this feature will help it become even stronger in the data warehousing market. However, I am curious to hear your thoughts on this.