26 August 2012 by Tom Collins
SQL Server monitoring is a powerful tool for the DBA. When a situation arises, for example a SQL Server instance becoming unavailable, an alert is sent to the relevant groups. Equally important are Daily Reports. The reasoning behind Daily Reports is twofold: 1) They're a proactive way of discovering hints that could point to an imminent outage. 2) Daily Reports can absorb and report on low-level situations which, if they were included in monitoring, would have the DBAs receiving alerts all day.
The trick behind Daily Reports is to identify the main sources of messages and then actually look at the reports. It's easy to become information blind, with so many emails, documents and reports arriving at a busy DBA's desk. The Daily Reports I find useful across SQL Server, DB2 and Oracle are:
1) Daily Health Check – comprised of different customised details
3) Database Server Error Log files – such as SQL Server Error Logs and db2diag
4) OS Event Logs – for Windows and Linux
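As a rough illustration of how a Daily Report can absorb low-level messages instead of alerting on each one, here is a minimal Python sketch that counts keyword hits in a batch of log lines. The keyword list, the function name `summarise_log` and the sample lines are all made up for illustration; they are not the format of any real SQL Server, db2diag or OS log.

```python
from collections import Counter

# Hypothetical keywords worth surfacing in a daily report (an assumption).
KEYWORDS = ("error", "warning", "failed")


def summarise_log(lines, keywords=KEYWORDS):
    """Count how many lines mention each keyword, case-insensitively."""
    counts = Counter()
    for line in lines:
        lowered = line.lower()
        for keyword in keywords:
            if keyword in lowered:
                counts[keyword] += 1
    return counts


# Invented sample lines, only to show the shape of the input.
sample = [
    "2012-08-26 02:00:01 Backup completed",
    "2012-08-26 02:10:44 Warning: disk latency high",
    "2012-08-26 03:15:09 Error: login failed for user",
]
summary = summarise_log(sample)
for keyword in KEYWORDS:
    print(f"{keyword}: {summary[keyword]}")
```

A real report would read the actual log files and probably group by source and severity; the point is only that one summary per day replaces a stream of individual alerts.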
Staying on the theme of Daily Reporting: when I spot an issue, which could be anything from a suspicious Kerberos warning through to an unresponsive disk, the procedure is to collate as much information as possible from OS logs, SQL Server logs and other sources such as DMVs, traces or custom scripts. I then contact the relevant groups to discuss and decide on any escalation. My favourite cases are when it's an imminent server outage; in other words, we've spotted some warnings and are taking quick action to avert it. Maybe that is the superhero complex, based on years of movie watching, but it is satisfying to identify these problems and clear them up.
If an outage occurs, there's normally a debrief. In attendance will be application owners, DBAs and Operations staff. Quite often a range of views is expressed. Over the last year I have commonly heard the term "self-healing". There is an idea amongst management that all problems can be predicted and scripted responses can be prepared. This should be the aim, and there are situations where it's suitable. But would you risk data to self-healing? For example, if a scheduled BACKUP fails and the response is another execution of the script, can the script take into account that there is high IO pressure on the server, which is causing all traffic to struggle?
I know this seems obvious but, once again, there was a project delay this week due to lack of detailed planning. Normally I create a very detailed DBA plan covering every step required to migrate a server, including Task Detail, Start Time, End Time, Dependencies etc. On this occasion, due to Production issues, we skipped the detail. "It's all in the detail!" is my mantra.
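To make the BACKUP question concrete, here is a minimal Python sketch of a self-healing decision that looks at IO pressure before re-running a failed job. Everything in it is an assumption for illustration: the function name, the retry limit and the 50 ms latency threshold are invented, and how the latency figure is measured is left to whatever monitoring is already in place.

```python
def decide_backup_response(attempts, io_latency_ms,
                           max_attempts=2, latency_limit_ms=50.0):
    """Return 'retry' or 'escalate' for a failed scheduled backup.

    attempts          -- how many times the backup has already run
    io_latency_ms     -- current disk latency, supplied by the caller
    max_attempts      -- hypothetical cap on automatic re-runs
    latency_limit_ms  -- hypothetical threshold for "high IO pressure"
    """
    if attempts >= max_attempts:
        # Repeated failures need a human, not another blind re-run.
        return "escalate"
    if io_latency_ms > latency_limit_ms:
        # The server is already struggling; a retry adds more IO load.
        return "escalate"
    return "retry"


print(decide_backup_response(attempts=1, io_latency_ms=10.0))
print(decide_backup_response(attempts=1, io_latency_ms=120.0))
```

Even a guard like this only covers the conditions someone thought of in advance, which is the limitation of self-healing described above.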
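One way to keep that detail honest is to hold the plan as data and check it. The following Python sketch is only an illustration: the `Task` fields mirror the columns mentioned above, while the class, the function name and the two example tasks are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Task:
    detail: str
    start: datetime
    end: datetime
    dependencies: list = field(default_factory=list)  # names of other tasks


def find_plan_problems(tasks):
    """Return a list of human-readable problems found in the plan."""
    by_name = {task.detail: task for task in tasks}
    problems = []
    for task in tasks:
        if task.end <= task.start:
            problems.append(f"{task.detail}: end time is not after start time")
        for dep in task.dependencies:
            if dep not in by_name:
                problems.append(f"{task.detail}: unknown dependency '{dep}'")
            elif by_name[dep].end > task.start:
                problems.append(f"{task.detail}: starts before '{dep}' finishes")
    return problems


# Invented two-step plan; the restore is scheduled too early on purpose.
plan = [
    Task("Take full backup",
         datetime(2012, 8, 25, 20, 0), datetime(2012, 8, 25, 21, 0)),
    Task("Restore database on new server",
         datetime(2012, 8, 25, 20, 30), datetime(2012, 8, 25, 22, 0),
         dependencies=["Take full backup"]),
]
problems = find_plan_problems(plan)
for problem in problems:
    print(problem)
```

A spreadsheet does the same job; what matters is that every step has a start, an end and its dependencies written down before the migration begins.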
SQL Server – Dynamic RAM and Static RAM
SQL Server – DNS NSLOOKUP and Resolve IP
SQL Server – Predict SQL BACKUP DATABASE finish time with sys.dm_exec_requests
SQL Server – SQL Server Restart with Powershell
SQL Server – Manage SQL Data and SQL Log File Locations
SQL Server – Delete SQL Logon TRIGGER syntax
DB2 – How to pscp from the command line