24 July, 2012 by Jack Vamvas
Maintaining service uptime is the holy grail for Production DBAs. Once Service Level Agreements (SLAs), Recovery Point Objectives (RPO) and Recovery Time Objectives (RTO) are agreed, the DBA swings into action, using technologies and implementing processes to achieve those objectives. That’s just the start. Maintaining the systems is the tricky part.
The DBA morning routine is sacred. Once I arrive at work the steps are:
2) Any fatal page/email alerts?
5) SQL Server Logs Critical in the last 24 hrs (includes Backup failures)
6) SQL Server Login Failed Critical in the last 24 hrs
7) DB2 diagnostics in the last 24 hrs (error and severe)
8) Performance Issues
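The "last 24 hrs" checks in the list above can be sketched as a simple filter over log records. This is only an illustration: the `LogEntry` shape, the `recent_critical` helper, the severity names and the sample messages are all assumptions made for the example, and real SQL Server or DB2 logs would need their own parsing step first.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class LogEntry:
    # Hypothetical record shape; not the format of any real log.
    timestamp: datetime
    severity: str
    message: str


def recent_critical(entries, now, severities=("critical", "error", "severe"),
                    window_hours=24):
    """Return entries inside the time window whose severity is being watched."""
    cutoff = now - timedelta(hours=window_hours)
    watched = {s.lower() for s in severities}
    return [e for e in entries
            if e.timestamp >= cutoff and e.severity.lower() in watched]


# Made-up sample data for illustration only.
now = datetime(2012, 7, 24, 9, 0)
entries = [
    LogEntry(datetime(2012, 7, 24, 2, 15), "Critical", "Backup failed for database Sales"),
    LogEntry(datetime(2012, 7, 22, 23, 0), "Critical", "Backup failed for database HR"),
    LogEntry(datetime(2012, 7, 24, 8, 0), "Info", "Backup completed"),
]

for entry in recent_critical(entries, now):
    print(entry.timestamp, entry.severity, entry.message)
```

Only the first sample entry is printed: the second falls outside the 24-hour window and the third is informational.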
All the reports are based on exceptions and thresholds. I don’t want to see reports of systems running well. I expect them to run well.
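A minimal sketch of exception-based reporting, assuming each server exposes a few numeric readings. The metric names, threshold values and server names below are hypothetical and chosen only to show the idea: anything within its threshold is left out of the report.

```python
# Hypothetical thresholds: metric name -> maximum acceptable value.
THRESHOLDS = {
    "disk_used_pct": 85,
    "hours_since_last_backup": 24,
    "failed_logins_24h": 10,
}


def exceptions_only(readings, thresholds=THRESHOLDS):
    """Return (server, metric, value, limit) for readings above their threshold."""
    report = []
    for server, metrics in readings.items():
        for metric, value in metrics.items():
            limit = thresholds.get(metric)
            if limit is not None and value > limit:
                report.append((server, metric, value, limit))
    return report


# Made-up readings for illustration only.
readings = {
    "server_a": {"disk_used_pct": 62, "hours_since_last_backup": 6, "failed_logins_24h": 0},
    "server_b": {"disk_used_pct": 91, "hours_since_last_backup": 30, "failed_logins_24h": 2},
}

for server, metric, value, limit in exceptions_only(readings):
    print(f"{server}: {metric}={value} exceeds {limit}")
```

Here `server_a` produces no output at all, which is the point: a healthy system stays silent.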
1) Are there any production systems not available? If so, fix them.
2) Are there any Dev/Test/QA servers not available? If so, fix them.
3) Are there any issues which might impact RPO, RTO and SLA? If so, fix them.
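Reading the numbering of these three questions as a priority order (an assumption on my part), the triage can be sketched as a sort. The `Priority` names, the `triage` helper and the sample issues are hypothetical.

```python
from enum import IntEnum


class Priority(IntEnum):
    # Lower value means more urgent; mirrors the three questions in order.
    PRODUCTION_DOWN = 1
    NON_PRODUCTION_DOWN = 2
    OBJECTIVE_AT_RISK = 3  # anything that might impact RPO, RTO or SLA


def triage(issues):
    """Sort (priority, description) pairs so the most urgent come first."""
    return sorted(issues, key=lambda issue: issue[0])


# Made-up issues for illustration only.
issues = [
    (Priority.OBJECTIVE_AT_RISK, "log backups on server_c running behind schedule"),
    (Priority.PRODUCTION_DOWN, "server_a not responding"),
    (Priority.NON_PRODUCTION_DOWN, "qa_server_1 offline"),
]

for priority, description in triage(issues):
    print(priority.name, "-", description)
```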