09 July,2012 by Jack Vamvas
Minimising exposure to critical systems downtime risk is a hot topic ,amongst Enterprise Architects and DBAs. It constantly amazes me how senior management focus on expensive solutions , when the basics aren’t covered. For example, have the bugs been fixed? Is documentation in place? Are there regular reviews? Security Audits? Data partitioning?
Of course , solutions focused on SQL Server Redundancy and High Availability, large scale Disaster Recovery , shortening Backup windows , Fast Recovery require analysis and planning – with budget implications. Some IT departments thrive on deployment. It keeps Architects and Engineers busy – but the biggest cost can result from downtime due to poor quality database server maintenance.
I spent time this week dealing with Operations tasks , exacerbated by lack of documentation awareness. The vision of outsourcing Operational DBA tasks has good potential – but the reality can be different. Defined DBA tasks, supported by accurate documentation – can be managed by Operational staff . Therefore , to make the outsourcing effective , the DBA team must : a) create accurate operational documentation b) create a handover process to Operational\outsource staff , ensuring they are familiar with the documentation c) a process to feedback and improve the documentation
An ESX server crashed , hosting 9 VMS with SQL Server installed. The VM administrator recovered the VMs onto a standby host in the cluster . The problem was the DBA team were not made aware of the situation until the daily Reports arrived. Firstly, the monitoring did not raise any alerts – possible due to a recent upgrade of the monitoring software . The deployment was rushed with inadequate testing of every situation. The database servers were not Tier 1 – but this outage highlights the importance of testing all known situations prior to going live . There was management pressure – but as the DBA it’s vital to not approve changes until standards are met.
I’m giving extra focus to Failed Logon attempts. Failed Logon attempts are a good source of clues for security checks and application issues. During investigation this week , I found attempted Linked Server connections from Development to Production environments . I tracked down the source and modified the Linked Server. I’ve introduced a daily Report to check Failed Logon attempts.
Lock pages in memory privileges granted as Group Policy is a debating point amongst SQL Server DBAs particularly on a Windows 2008 \ SQL Server 2008 64 bit system . The Lock pages in memory privileges prevent data paging from physical memory to virtual memory . I enable Lock Pages in Memory via Powershell,PSEexec and lock pages in memory on fresh installations. According to SQL Server Books Online “Locking pages in memory is not required on 64-bit operating systems”. The Lock memory in pages privileges doesn’t compensate for inaccurate Max Server memory settings . I have seen Out Of Memory errors once switching it on . Take the whole system into account – including the OS and other instances