Monday, July 2, 2012

How to measure a DBA's workload?


For the past few days, I have been thinking of ways to measure the workload of a DBA, or a team of DBAs, in an operations context. Metrics such as ticket counts, instance or database counts, and so on each have their own imperfections, but they all have advantages too. Maybe you've got a simple formula that combines multiple metrics?

It all depends on the purpose. When planning DBA team resourcing at one of my previous workplaces, we used time, because we knew how much time we could get from the team (minus vacations, overhead, training, internal projects, etc.) and how much time we had committed to customers and to the forecasts. I didn't find any single dimension, or combination of dimensions, that could reliably give a universal indicator. Take database count, for example: one customer may take ten times more effort per database than another, because the environment and scope of work are different. It's always comparing apples to oranges.

When I managed a team, I split the work into five tracks: Tickets (Issues/Problems), Projects, Learning, Admin, and Time Off. Time and effort expended in hours were the basic metrics. The key is to have a time forecast of loading or utilization, based on historical ticket trends, the projects currently being worked on, Admin, and Time Off, so you can demonstrate the capacity of your team in a systematic way (a rough sketch of such a forecast follows below). We could learn a lot from the PM folks; they are always busy and they can prove it! If you don't have this, you will be challenged on what your resources are doing and you will end up overloaded. Run your group like a consulting business and you will find it is the way to go.
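
A minimal sketch of that kind of capacity forecast might look like the following; the track split mirrors the one above, but the hour figures, function names, and the 160-hour working month are illustrative assumptions, not numbers from any real team.

```python
# Hypothetical monthly capacity forecast for a DBA team, split into the five
# tracks described above. All figures are illustrative assumptions.

WORK_HOURS_PER_MONTH = 160  # per DBA, before any deductions (assumed)

def available_hours(dba_count, timeoff_hours, admin_hours, learning_hours):
    """Hours actually left for Tickets and Projects after the other tracks."""
    gross = dba_count * WORK_HOURS_PER_MONTH
    return gross - timeoff_hours - admin_hours - learning_hours

def utilization(ticket_hours_forecast, project_hours, capacity):
    """Forecast loading as a fraction of available capacity."""
    return (ticket_hours_forecast + project_hours) / capacity

capacity = available_hours(dba_count=6, timeoff_hours=80,
                           admin_hours=60, learning_hours=40)
load = utilization(ticket_hours_forecast=500, project_hours=240,
                   capacity=capacity)
print(f"Available: {capacity} h, forecast utilization: {load:.0%}")
```
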

I have done a bit of research on this subject before; please see http://www.khwajaimran.blogspot.com/2012/07/how-many-dbas-do-you-need-to-support.html

Essentially, I found that an average DBA can take care of about 40 databases, up to roughly 5 TB in total. Bigger data is harder to handle, so it isn't universal, but it is a good baseline. I know every shop is different, but I have been to at least a hundred different companies and this seems to hold relatively true, even across industries and vendors. Caveat emptor: there are always exceptions.

Were I measuring a team, I would come up with a baseline somewhat like this, but customized for our environment. Then I would look at whatever factors either increase revenue, decrease cost, or remove risk for the business, in a quantitative way. Rank your people based on these, as they are the only things your executive team cares about. If you can explain your team's actions in these terms, you will be much better off.

From my experience as a DBA for over eight years, I agree with the 40:1 ratio and the database-size cap, but the load on each database is also a major point that managers need to keep a close eye on when estimating head count for each technology's DBAs. I define DB load as the number of concurrent sessions and transactions. We may also have data warehouse systems, which carry comparatively few concurrent sessions, yet the database still has to be tuned so that the daily data loads and other processes finish within the agreed SLA limits, so the DBAs carry load there as well. So, when I estimate DBA requirements, I mostly take care of the following (a rough headcount sketch follows the list):
  1. Technology DBAs
  2. Number of databases/instances
  3. Number of servers
  4. Size of each database
  5. Load on each database/instance
  6. Capabilities of existing staff members
  7. Expected average completion time for each production DB support task
  8. Projects in hand (other than regular production support)
  9. Expected projects
  10. Project timelines
  11. Staff (scheduled and unscheduled leave taken into consideration)
  12. Last but not least, commercials: low-cost projects and tasks should be done by junior staff, while critical deliverables that impact the business can be taken care of by senior staff.
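
As a rough illustration of how a few of these factors could be combined, here is a hedged sketch; the per-DBA thresholds and the load factor are my own assumptions rather than figures from the comment above, and a real estimate would also have to weigh skills, project timelines, and commercials.

```python
import math

# Rough DBA headcount estimate combining instance count, total size and
# project load. Thresholds and sample figures are illustrative assumptions.

INSTANCES_PER_DBA = 40        # baseline count ratio discussed in the post
TB_PER_DBA = 5.0              # size cap per DBA
PROJECT_HOURS_PER_DBA = 120   # project hours one DBA can absorb per month

def estimate_dbas(instance_count, total_size_tb, monthly_project_hours,
                  load_factor=1.0):
    """Take the largest requirement implied by count, size and project load.
    load_factor > 1.0 models heavily loaded, high-concurrency systems."""
    by_count = instance_count / INSTANCES_PER_DBA
    by_size = total_size_tb / TB_PER_DBA
    by_projects = monthly_project_hours / PROJECT_HOURS_PER_DBA
    return math.ceil(max(by_count, by_size, by_projects) * load_factor)

print(estimate_dbas(instance_count=120, total_size_tb=18,
                    monthly_project_hours=300, load_factor=1.2))  # -> 5
```
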
Again, this is always tough to document and prove. I think it depends on the priorities from up top. For us, even though DB projects lag or never get done, the executives don't really care about that metric. Nor do they care about the number of instances or databases managed per DBA. They see the bottom line: everything is fine operationally, so why add headcount? What they do value is development of our core product. If something were to slow down development (e.g., ticket turnaround for new or altered schemas), they would notice and react. It's certainly not perfect or all-encompassing, but developer tickets are, right now, the only way I've justified additional headcount.

For me, the main measure is % completion within SLA. Of course, this assumes you have agreed SLAs for all your activities. For break/fix, the SLA is driven by the urgency assigned when the ticket is raised. For scheduled activities (new DB builds, etc.) it is pre-defined or agreed at the time the ticket is raised. If your team cannot deliver a consistent within-SLA completion rate of 97% or more, then perhaps you should be looking at the structure of your team.
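
As a simple illustration (the ticket data and field layout below are invented, not taken from any real ticketing system), the within-SLA completion rate is just the share of tickets resolved within their agreed time:

```python
# Minimal sketch: percentage of tickets completed within SLA.
# Each entry is (hours_to_resolve, sla_hours); the data is made up.

tickets = [
    (2.5, 4),    # resolved in 2.5 h against a 4 h SLA
    (30.0, 24),  # breached
    (6.0, 8),
    (1.0, 2),
]

within_sla = sum(1 for resolved, sla in tickets if resolved <= sla)
rate = within_sla / len(tickets)
print(f"Completion within SLA: {rate:.1%}")   # 75.0% here; the target is 97%+
```
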

One thing to also consider is the type of application(s) running on the database(s). A database serving a well-written application - no matter its size, load, or anything else - is going to require a lot less time from a DBA than a database serving a poorly written application. I've seen many cases where developers have general database skills but don't design correctly for the database they're using. Then they release the application, and two months later they frantically phone the DBA team saying the database is slow... I think we've all been there. The point is, database maintenance effort has a direct relationship to the quality of the application running against it. Just another thing to consider along with the metrics.

A new question arises here: does this hold true for different types of databases? Teradata? SQL Server? MySQL? Sybase? DB2?

In my previous post, I suggested that SQL Server is somewhat easier to manage (i.e., more instances per DBA), but I think that has a smaller effect than other things. My own experience is that the type of database has less influence on a team's efficiency than other elements. For example, I have seen two teams in comparable technical environments with widely varying efficiencies (60 instances per DBA versus 110). The differentiating factors seem to be non-technical: quality of documentation, maturity of ITIL processes, and the overhead induced by those processes.

Sometimes I have a different thought again: basing the number of DBAs on the size of the database doesn't help. In the era of big data, size doesn't matter; the number of tickets, average ticket resolution time, SLA trends, database criticality (Tier 1, 2, 3), and so on are far more important than database size. I find the size of the database has little to do with the workload of the DBA. We've got a number of smaller databases that require many hours of DBA work simply because of the activity levels of the business. Some of our internal clients are continually making changes or creating short-term applications (some of these are online games or projects related to broadcast programming). These projects usually have very tight deadlines and require considerable attention from our DBAs.

I don't think you can base a DBA's workload simply on the number of tickets (one ticket might require 10 minutes of work while another could take a week). The size of the database isn't necessarily an indicator of workload either. I think the best measurement is the hours required to work on a given project or task. And even then, the result may differ depending on the amount of experience a DBA has.

If you had a ticketing system that was rigorously used both for tracking work and for recording time spent, and you were able to report on it by database, then that would be an excellent proxy for what you are describing.
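
A minimal sketch of that kind of report, assuming you can export ticket rows with a database name and logged hours (the export layout and sample rows here are assumptions):

```python
from collections import defaultdict

# Sketch: aggregate logged hours per database from exported ticket rows.
# The (database, hours) layout and the sample rows are assumptions; a real
# ticketing system needs its own extract.

ticket_rows = [
    {"database": "SALESDB", "hours": 3.0},
    {"database": "HRDB",    "hours": 0.5},
    {"database": "SALESDB", "hours": 12.0},
]

hours_by_db = defaultdict(float)
for row in ticket_rows:
    hours_by_db[row["database"]] += row["hours"]

# Databases consuming the most DBA time first - a proxy for workload.
for db, hours in sorted(hours_by_db.items(), key=lambda kv: -kv[1]):
    print(f"{db}: {hours:.1f} h")
```
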

Once again, there is no unique formula. Two major examples: 

1. If DBA #1 works one hour on a ticket and DBA #2 also works one hour on another ticket, is the work they do the same?
Answer: NO. 

2. If DBA #1 handles 20 databases daily and so does DBA #2, are they equally busy?
Answer: NO.

Measuring only on the basis of ticket counts is wrong. We have to add in the complexity of the tickets: one DBA may spend eight hours resolving a highly critical client issue that others could not solve, while another DBA spends the same eight hours on small, routine daily activities of equal importance to the business. In short, DBAs are valuable resources to the company, and their managers have to decide, depending on the scenario and the current requirements, what is most fruitful.

In my opinion, DBA workload should be measured by the complexity of the tasks and the time they take to complete.

For example, one DBA may spend all day working on a single big project, say an 11g R2 upgrade.

You can have another DBA do 50 BAU tickets.

DBA workload should be measured against the DBA's skill-set level and how much project and BAU operations work he or she can handle. Nowadays it is not only technical skills that matter; we also have to measure the soft skills they possess. If there is a gap in that area, projects suffer due to communication issues.

Long story short, we should categorize DBA workload as Level 1 (Entry Level), Level 2 (Medium), Level 3 (Complex). 

Measuring DBA workload is a bit like a tuning exercise. You start with a baseline of your team's strength and the tasks in hand, and based on the upcoming activities you can compute whether you can handle the additional workload with the current resource pool, whether you need to hire additional resources, or whether you need to limit projects.

I agree that DBA workload should be measured as entry, medium, or complex. Most ticketing systems do not have such a measure, or if one exists its values are often questionable, because the people who create, assign, or approve DBA tickets cannot judge the complexity of the work. Surprisingly enough, many generalist managers still cannot comprehend the difference between compressing a file with WinZip and compressing a table with Oracle Advanced Compression.

Database Migrations Are Finally Becoming Simpler


Lately I have been getting quite a few inquiries on database migrations, asking me about migration approaches, best practices, and tooling. Some of the reasons companies are looking at migrations are staffing issues, scalability and security concerns, and cost-saving strategies. Migrations have always been complex, time consuming, and expensive, because each DBMS has proprietary data structures, data types, and SQL extensions. Once you choose a DBMS, you get locked into its platform, which often creates a challenge when you want to migrate. Some companies told me that it took them more than a year to migrate some large and complex applications, which is not surprising. Although tools and services have helped with migrations in the past, they have not been comprehensive and automated enough to make the process simple. Speaking as an IT manager in the retail sector, I would put it this way: “We did not want to spend a million dollars on tools and services just to save a million dollars on our database platform; it just didn’t make sense.”


The good news is that a new option has recently emerged and has been gaining ground. Enter the “database compatibility layer,” which changes the way migrations are done. IBM DB2 9.7 for Linux, UNIX, and Windows offers out-of-the-box compatibility for Oracle’s PL/SQL and Sybase ASE, which allows many applications to run against DB2 without any application code changes. Although the database compatibility layer does not offer 100% compatibility today, based on customer feedback from more than two dozen interviews that I have conducted this year, one can usually expect 90% or more compatibility, requiring only minor code changes. This is huge: it makes migrations simpler, taking only days or weeks as opposed to months or years. IBM jointly developed the database compatibility layer technology with the vendors EnterpriseDB (for Oracle compatibility) and ANTs (for Sybase ASE compatibility), who also offer their own database migration solutions.


The database compatibility layer finally opens up the door to more migrations, reducing risk and making them cost effective. I estimate that so far, some 300 companies have migrated their databases using the database compatibility layer, and this is likely to become the standard approach for database migrations in the future.


I would love to hear feedback on database migration experiences using the compatibility layer or other approaches -- things that have worked for you and things that did not. Also, are you looking to migrate anytime soon or holding back because of the risk involved?

How Many DBAs Do You Need To Support Databases?



I frequently get asked how many databases a DBA typically manages. Over the past few years, I have done research on this topic with some of my friends in the DBA fraternity, senior DBAs, and DBA managers, asking them about their ratios and how they improved them. Typically I find that the current industry average is 40 databases to a DBA for large enterprises ($1 billion+ in revenue), with the lowest ratio seen around eight and the highest at 275. So why this huge variation? There are many factors in customer deployments that contribute to it, such as the size of the databases, database tools, database versions, DBA expertise, the formalization of database administration, and production versus nonproduction.


This ratio is usually limited by the total size of all the databases a DBA manages, which tends to be around five terabytes per DBA. A terabyte-sized database remains harder to manage than a database of 100 GB; larger databases often require extra tuning, backup, recovery, and upgrade effort. In other words, one DBA can effectively manage 25 databases of 200 GB each, or five 1 TB databases. These figures include both production and nonproduction databases.


What are the factors that can help improve the ratio? Cloud, tools, the latest DBMS version (automation), and the DBMS product used – Oracle, SQL Server, Teradata, DB2, MySQL, or Sybase. Although most DBMS vendors have improved manageability over the years, based on customer feedback, Microsoft SQL Server tends to have the best ratios.


I believe that although you should aim for the 40:1 ratio and the 5 terabyte cap, you should also establish your own baseline from your database inventory and DBA headcount, and use that as the basis for improving the ratio over time.
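
A minimal sketch of establishing such a baseline, with an invented inventory (the counts and sizes below are placeholders; only the 40:1 and 5 TB guideline figures come from the discussion above):

```python
# Sketch: compute your own database-to-DBA baseline and compare it with the
# 40:1 / 5 TB guideline. The inventory figures below are placeholders.

current_dbas = 4
database_count = 130
total_size_tb = 22.0

count_ratio = database_count / current_dbas   # databases per DBA
size_ratio = total_size_tb / current_dbas     # TB per DBA

print(f"Baseline: {count_ratio:.1f} databases/DBA, {size_ratio:.1f} TB/DBA")
if count_ratio <= 40 and size_ratio <= 5:
    print("Within the guideline")
else:
    print("Above the guideline - look at tooling, automation or headcount")
```
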


I would love to hear about your database-to-DBA ratios and what's working and what's not in your organization when it comes to database administration.

Sunday, July 1, 2012

Is SQL Tuning Worth It in a Production Database Environment?


In my opinion, DB/instance tuning is the practical option for most DBAs, because most of the time we do not have the right to change the SQL in the application; the application team is not willing to change the code and its style, especially when it is dynamic SQL.


SQL tuning generally delivers the most significant performance gains. Lack of experience or time, inability to change the SQL, and other barriers are mostly excuses. It might not be easy at times, but it needs to be done. I think SQL tuning is worthwhile in every environment, not just in production. My point is that there is good and not-so-good SQL tuning, and tuning one SQL statement can un-tune several others.


SQL tuning works best while the application is still under development, when development team members and database analysts need to know how much data will be fetched and what the timing statistics look like. In most production database environments, I also see that the time spent tuning a query is much greater than the time spent tweaking a parameter, at least for someone with good knowledge of those parameters. But again, SQL tuning should be taken care of by the development team members, since they know more about the application's behavior. I have also seen previous teams, under pressure to release a change request (CR), push out untuned code in a hurry; once it reaches production, the blame game starts.


In my opinion, whether it's developers, development DBAs, or production DBAs, they should be given access to production environments to assist with SQL diagnostics and tuning. How the changes to the SQL are made is subject to your change management and development life cycle.


Again, SQL tuning is one of the biggest performance tweaks you can make. It's true that you don't always have the ability to change the SQL if it's dynamically generated, comes from a vendor product, or the developers just don't have time to work on it. Nonetheless, I believe it can have some of the biggest impact on databases, production or not, and provide some of the biggest performance improvements you will see.


Certainly there can be other areas of tuning that have a significant impact on performance - memory, CPU, buffer space, etc. But after many years of being a DBA, the thing I've seen more often than anything else is inefficient SQL. It's worth knowing how to find such code (a short sketch of one way to do that follows) and determining whether anything can be done to improve it. While you may sometimes be limited to throwing hardware at performance issues, that's certainly neither the best nor the most cost-effective way to improve performance.
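
For instance, on Oracle one common starting point is the cursor cache. The sketch below is a hedged illustration: it assumes an Oracle database, SELECT privilege on the standard v$sql view, and the cx_Oracle Python driver; the connection string and account are placeholders.

```python
import cx_Oracle

# Sketch: list the most expensive statements in the cursor cache as a
# starting point for SQL tuning. elapsed_time in v$sql is in microseconds.
TOP_SQL = """
    SELECT * FROM (
        SELECT sql_id,
               executions,
               ROUND(elapsed_time / 1000000, 1) AS total_seconds,
               SUBSTR(sql_text, 1, 60)          AS sql_text
          FROM v$sql
         ORDER BY elapsed_time DESC
    ) WHERE ROWNUM <= 10
"""

conn = cx_Oracle.connect("monitor/secret@dbhost:1521/ORCL")  # placeholder
try:
    for row in conn.cursor().execute(TOP_SQL):
        print(row)
finally:
    conn.close()
```
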


In our shop, we look for areas in which the SQL might be improved, and if we see changes that need to be made to the SQL statement itself, we work with the development group to make and test those changes before pushing them into production.

TODAY'S IT INFRASTRUCTURE

I found the following once in a presentation. It gives you an idea of "Today's IT Infrastructure". ;-))))



Cheers,
Khwaja Imran
www.khwajaimran.com