12 July, 2011 by Jack Vamvas
I’m a DBA in a data centre with 400+ database servers, supported by 3 DBAs. Administering T-SQL best practices is one of the challenges.
I’d like to review and sign off all T-SQL code prior to production release, but due to competing priorities it isn’t possible to review it all. Developers use the Dev and Test environments to assess their code.
Because of resource constraints, my focus is on performance tuning and optimizing the database server environment. If a developer experiences lower than expected performance, they approach me to discuss and review the code.
The application mix on the environment includes OLTP, DSS, modelling, in-house development and third-party applications.
Changing third-party T-SQL code is difficult due to licensing and support agreements. T-SQL code not performing optimally is identified, documented and sent back to the third party. In the meantime, I’m dealing with low-performing code.
I use various tactics while waiting for the third party to reply. One of those is filtered statistics on SQL Server 2008 databases.
When a query is executed and statistics are missing, SQL Server flags the statistics as missing and attempts to generate them the next time the query is executed.
The DBA has a number of options: AUTO_CREATE_STATISTICS, AUTO_UPDATE_STATISTICS and CREATE STATISTICS. These options don’t always solve the problem.
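For reference, the first two options are database-level settings and the third is a per-column command. A minimal sketch, assuming illustrative database, table and column names (not part of the demo below):

```sql
-- Database-level automatic statistics settings
-- (database name MyDatabase is illustrative)
ALTER DATABASE MyDatabase SET AUTO_CREATE_STATISTICS ON;
ALTER DATABASE MyDatabase SET AUTO_UPDATE_STATISTICS ON;
GO
-- Manual, unfiltered statistics on a single column
-- (table and column names are illustrative)
CREATE STATISTICS st_MyColumn ON dbo.MyTable(MyColumn);
GO
```

These build statistics over the whole column, which is exactly where they can fall short on skewed data.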
Filtered statistics create statistics for a subset of data in one or more columns. The filtered statistics use a predicate to define the subset, and this predicate influences the cardinality estimates in the query plan.
One of the reasons filtered indexes are useful is filtered statistics: the statistics cover just the rows in the filtered index, not the full table.
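As a sketch of that point, creating a filtered index also creates a statistics object restricted to the same predicate, so the histogram describes only the matching rows. The index name, table and predicate here are illustrative and separate from the demo below:

```sql
-- Filtered index: the statistics object built with it covers only
-- rows satisfying the WHERE predicate, not the full table
CREATE NONCLUSTERED INDEX nc_Orders_Open
ON dbo.Orders(order_date)
WHERE order_status = 'Open';
```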
An example of the workaround T-SQL code to achieve accurate cardinality estimates:
---the key column to focus on is EstimateRows
CREATE TABLE Musician (musician_id INT, name NVARCHAR(50));
GO
CREATE TABLE Age (age_id INT, age_value INT);
GO
CREATE CLUSTERED INDEX ci1 ON Musician(musician_id);
GO
CREATE INDEX nc_Musician_name ON Musician(name);
GO
CREATE STATISTICS st_musician_id_name ON Musician(musician_id, name);
GO
CREATE CLUSTERED INDEX ci2 ON Age(age_id, age_value);
GO
INSERT INTO Musician VALUES (0, 'Prince');
INSERT INTO Musician VALUES (1, 'Sting');
INSERT INTO Musician VALUES (2, 'The Edge');
GO
INSERT INTO Age VALUES (1, 1);
DECLARE @a INT;
SET @a = 1;
WHILE @a < 10000
BEGIN
    INSERT INTO Age VALUES (0, @a);
    SET @a = @a + 1;
END
GO
---manual statistics update
UPDATE STATISTICS Musician WITH FULLSCAN;
UPDATE STATISTICS Age WITH FULLSCAN;
GO
SET STATISTICS PROFILE ON;
GO
SELECT [name] FROM Musician AS m
INNER JOIN Age AS a ON m.musician_id = a.age_id
WHERE name = 'Prince';

SELECT [name] FROM Musician AS m
INNER JOIN Age AS a ON m.musician_id = a.age_id
WHERE name = 'Sting';
GO
SET STATISTICS PROFILE OFF;
GO
---On the Prince search I expect 9999 rows: Underestimating
---On the Sting search I expect 1: Overestimating
CREATE STATISTICS Musician_st_musician_id ON Musician(musician_id)
WHERE name = 'Prince';
GO
CREATE STATISTICS Musician_st_musician_idB ON Musician(musician_id)
WHERE name = 'Sting';
GO
SET STATISTICS PROFILE ON;
GO
SELECT [name] FROM Musician AS m
INNER JOIN Age AS a ON m.musician_id = a.age_id
WHERE name = 'Prince';

SELECT [name] FROM Musician AS m
INNER JOIN Age AS a ON m.musician_id = a.age_id
WHERE name = 'Sting';
GO
SET STATISTICS PROFILE OFF;
GO
---That’s better. The key column to focus on is EstimateRows
---On the Prince search I expect 9999 rows: Correct Estimate
---On the Sting search I expect 1: Correct Estimate
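To confirm what the filtered statistics contain, the statistics objects created above can be inspected. DBCC SHOW_STATISTICS reports the filter expression and the rows the histogram was built from, and sys.stats exposes the predicate directly:

```sql
-- Inspect one of the filtered statistics objects created above
DBCC SHOW_STATISTICS ('Musician', Musician_st_musician_id);
GO
-- sys.stats flags filtered statistics and stores the predicate
SELECT name, has_filter, filter_definition
FROM sys.stats
WHERE object_id = OBJECT_ID('Musician');
GO
```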
Filtered indexes are useful. I’ve found them useful when:
1) The query targets a well-defined subset of the data
2) The search criteria are consistent