Performance Tuning very large databases with OLTP presents a whole new set of problems. There are thousands of articles and tutorials on tuning databases of a generic shape & size . Here are some thoughts on approaching database servers with a high level of transactions , with minimal downtime.
Stability of response times is just as important as improving the amount of queries per second. When faced with millions of queries , a smooth customer experience is critical. Pages hanging because of slow queries is newsworthy in the negative sense.
Minimise variance. When there are thousands of requests per second, more energy should be placed on the extreme slow responses. With this approach you don’t focus on average query response time , but focus on the 5 slowest. Average query response time is important , as it reflects architectural decisions. Also , if you improve the performance of the slowest queries the average response time will improve,
When you’ve identified the 5 slowest queries , analyse them and fix them. The focus turns to QOS (Quality of Service). Typically this term is used on network systems, but applies equally to Database Servers. The main point of QOS is the performance experienced by the users.
What needs to be in place?
Develop a method of measuring how the database servers actually operate. Think in terms of subsecond detail
Measure the waits.SQL Server RUNNABLE – SUSPENDED – RUNNING status model
Find of way of recording and analysing the queries where something is wrong.
Don’t assume anything. I’ve seen all sorts of reasons for slowdown some examples include hourly jobs on ESX hosts grinding systems to a halt, dirty page flushing, statistics issue. All of these problems can also impact when the system is not at full load.
All aspects of the performance should be analysed.Application , DBMS – Data System ,DBMS - Access system,DBMS – Storage System,DBMS Buffer System,OS,Hardware Tuning,Network
Develop a method to observe the performance degradation and how to quickly drill down to the queries for troubleshooting.
Keep the analysis team small.Performance Tuning by Committee - SQL Antipattern #006