Monday, February 16, 2009

DB2 and EC2

Though irrelevant, there is a similarity of alphabetical pattern between DB2 and EC2: E comes just after D, and C comes just after B in the alphabet. Maybe this similarity aroused my interest in EC2... :) Also, IBM and Amazon have partnered to deliver DB2 in the Amazon EC2 (Elastic Compute Cloud) environment.

Cloud computing, the biggest buzzword of the day, is no longer perceived as mere hype. It has already started helping small startup companies, and new businesses are emerging on top of cloud computing infrastructure. Gradually, more and more companies will start embracing cloud computing.

Though I have been hearing about cloud computing for over a year, I never bothered to think about how similar or different it would be to manage DB2 databases hosted in a cloud computing environment. When I read IBM’s announcement (actually, I followed the link from Anant Jhingran’s blog post), it crossed my mind that very soon we DBAs may have to deal with DB2 instances hosted in the cloud. Suddenly, several questions started arising in my mind. Will the servers be accessible through a web browser only, or can we still use our favorite SSH client or Remote Desktop (in the case of Windows)? What will the physical database design look like? Can we also access these virtual servers through a VPN? And so on...

Out of curiosity, I thought of signing up for Amazon’s service to get a feel for EC2. Looking at their pricing, it did not appear to cost much just to try it out. On second thought, I decided to go through their documentation first. The documentation is concise and to the point. It helped clear the clouds over my understanding of Elastic Compute Cloud.

Here is a very high-level summary of what I learned about EC2 from their documentation.
  1. The virtual servers in the EC2 environment are basically running instances of an AMI (Amazon Machine Image). An AMI contains all the software, including the OS and associated configuration settings, applications, libraries, etc.
  2. There are already many existing AMIs (free as well as paid). A new AMI can also be created from scratch or based on an existing AMI. However, a new AMI has to be saved on Amazon S3 (Simple Storage Service), and there is a charge for this storage.
  3. There are different types of instances based on CPU, memory, storage and I/O requirements, and Amazon prices each type of instance differently.
  4. AMI instances are not persistent. You lose all your changes when you shut down the OS or terminate the AMI instance.
  5. For online data that requires persistence (for example, a DB2 database), they have a different solution called Elastic Block Store (EBS). An EBS volume is a storage volume that can be attached to a host (a running instance of an AMI). It can be used either as a raw device or with a file system created on it.
  6. Snapshots of EBS volumes can be created on Amazon S3 (Simple Storage Service). An EBS snapshot can then be used as the starting point for new EBS volumes.
  7. Static IP addresses can also be assigned to instances. They are called Elastic IP addresses.
  8. We can access the servers hosted in the EC2 environment using an SSH client or Remote Desktop.
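
To make points 4 to 6 concrete, here is a small toy model in Python of how instance storage, EBS volumes and snapshots relate to each other. It is only a sketch of the concepts as I understood them from the documentation, not Amazon's API: the class names (Instance, EbsVolume, Snapshot), their methods and the AMI name are all hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Snapshot:
    """Point-in-time copy of a volume (kept on S3 in the real service)."""
    data: Dict[str, str]

    def create_volume(self) -> "EbsVolume":
        # A new volume starts from a copy of the snapshot contents.
        return EbsVolume(dict(self.data))


@dataclass
class EbsVolume:
    """Persistent storage that outlives the instance it is attached to."""
    data: Dict[str, str] = field(default_factory=dict)

    def snapshot(self) -> Snapshot:
        # Copy the contents so later writes do not alter the snapshot.
        return Snapshot(dict(self.data))


@dataclass
class Instance:
    """A running copy of an AMI; its local disk is not persistent."""
    ami: str  # hypothetical AMI name, for illustration only
    local_disk: Dict[str, str] = field(default_factory=dict)
    volume: Optional[EbsVolume] = None
    running: bool = True

    def attach(self, volume: EbsVolume) -> None:
        self.volume = volume

    def terminate(self) -> None:
        # Local changes are lost; the attached volume is only detached.
        self.local_disk.clear()
        self.volume = None
        self.running = False


if __name__ == "__main__":
    volume = EbsVolume()
    server = Instance(ami="hypothetical-db2-ami")
    server.attach(volume)

    server.local_disk["/tmp/scratch"] = "work files"
    volume.data["/db2/data"] = "database containers"

    backup = volume.snapshot()
    server.terminate()

    print("local disk after terminate:", server.local_disk)
    print("EBS volume after terminate:", volume.data)
    print("volume restored from snapshot:", backup.create_volume().data)
```

In this model, anything a DB2 database needs to keep (containers, logs, backups) would live on the EBS volume, while the instance's local disk is treated as disposable.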

I like the ease of provisioning and of downsizing and upsizing servers in the EC2 environment. It also appears to me that the setup and maintenance of disaster recovery and business continuity solutions will be relatively easier in EC2. Testing additional computing power (capacity planning) for an increased workload will be much easier and more cost-effective. We don’t have to make a permanent investment in computing power and then realize later that the investment did not solve the capacity problem.

Having said all these good things about EC2, I do agree that we will come across several unpleasant aspects of it once we start using it. I found some such experiences discussed on Colin Percival’s blog.

In general, one of the big concerns about cloud computing is company policy and the security issues related to storing a company’s data outside the company network. Though EC2 appears to have reasonable security provisions, I am not an expert on network security, and hence I will abstain from commenting on security. However, if cloud computing provides, or comes to provide, robust security, I would anticipate company policies being revised over time to adapt to the new computing environment.


Jeffrey Benner said...

Good post. It is encouraging to find DB2 people moving so quickly to explore the EC2 possibilities opening up. I wonder if the DB2 cloud will allow companies to leapfrog past DPF. The cloud seems to offer a simpler means of extending very large databases without the complexities of partitioning.

Brian said...

Thanks for the great post. In addition to security concerns, as a DBA I also wonder about the cloud computing vendor's role in performance problems and if/how they would be held accountable for breach of things like performance-related service level agreements or availability.

I look forward to your session on automation at this year's IDUG North America.

Radhesh Kumar said...

Thanks, Jeffrey and Brian, for sharing your thoughts. Talking about performance reminds me of some of my test DB2 instances on VMware virtual servers. The performance sucks even though enough resources have been allocated to these virtual servers.

In my opinion, the newer server virtualization technologies still need to improve a lot in order to provide reasonable performance.