Question: I’m thinking of placing a large data column, an NVARCHAR(4000), as an INCLUDE column in an index. Will it have a negative impact on SQL Server performance?
Basically, is it a good idea? I’ve read that it isn’t.
Answer: I can’t tell you never to do it, but in general I’d avoid placing such a large data column in the INCLUDE clause of an index.
Some of the well documented benefits of placing non-key columns in the INCLUDE clause are that data types not allowed as key columns are permitted there, and that INCLUDE columns are disregarded by the database engine when it enforces the limits on the number of index key columns and the index key size. Read more on SQL Server – What is a SQL index key?
It is also possible to use INCLUDE columns to eliminate key lookups; read more on How to Remove a Key Lookup with an INCLUDE.
But these benefits need some consideration when deciding which non-key columns to add.
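As a minimal sketch of the pattern under discussion, assuming a hypothetical dbo.Customers table with a large Notes column:

```sql
-- Hypothetical table: Notes is NVARCHAR(4000), far too large to be an
-- index key column, but permitted in the INCLUDE clause.
CREATE TABLE dbo.Customers
(
    CustomerID INT NOT NULL PRIMARY KEY,
    LastName   NVARCHAR(100) NOT NULL,
    Notes      NVARCHAR(4000) NULL
);

-- Covering index: LastName is the key; Notes is stored at the leaf
-- level only, so queries that select Notes can avoid a key lookup.
CREATE NONCLUSTERED INDEX IX_Customers_LastName
    ON dbo.Customers (LastName)
    INCLUDE (Notes);
```

Whether this index helps or hurts depends on the trade-offs discussed below.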
The more rows you can fit on a page, the fewer pages need to be accessed, leading to better SQL Server performance. Adding a large column means fewer rows fit on each leaf page, so more pages must be read, particularly during scans, and this can slow query response times right down.
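One way to see the page cost for yourself is to check the leaf-level record size and page count of the index. A sketch, using the hypothetical table and index names from above:

```sql
-- avg_record_size_in_bytes shows how much wider the included column
-- makes each leaf row; page_count shows how many pages that costs.
-- (DETAILED mode is needed to populate the record-size columns.)
SELECT i.name                           AS index_name,
       ps.page_count,
       ps.avg_record_size_in_bytes,
       ps.avg_page_space_used_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(),
         OBJECT_ID('dbo.Customers'), NULL, NULL, 'DETAILED') AS ps
JOIN sys.indexes AS i
  ON i.object_id = ps.object_id
 AND i.index_id  = ps.index_id
WHERE ps.index_level = 0;  -- leaf level only
```

Comparing these figures with and without the large INCLUDE column makes the extra page overhead concrete.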
That’s the bad news. The good news is that you can monitor the queries, record the number of reads, and assess the genuine impact of adding a large data column to your index for yourself. Try measuring the read and write ratios on the column. If it belongs to a table you’re constantly inserting rows into, the extra index maintenance will probably slow query response times; but if the column sees far more reads than writes, it may not have such an impact.
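The read/write ratio can be measured with the sys.dm_db_index_usage_stats DMV. A sketch, again assuming the hypothetical index name from above (note these counters reset when the instance restarts):

```sql
-- Reads = seeks + scans + lookups; writes = user_updates.
SELECT i.name                                          AS index_name,
       us.user_seeks + us.user_scans + us.user_lookups AS reads,
       us.user_updates                                 AS writes
FROM sys.dm_db_index_usage_stats AS us
JOIN sys.indexes AS i
  ON i.object_id = us.object_id
 AND i.index_id  = us.index_id
WHERE us.database_id = DB_ID()
  AND i.name = 'IX_Customers_LastName';  -- hypothetical index name
```

A high write count relative to reads suggests the large INCLUDE column is costing more in maintenance than it saves in lookups.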
There are plenty of built-in SQL Server metrics to use in monitoring, for example STATISTICS IO, which reports the disk activity generated by a query.
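A minimal sketch of using it: run the same query with STATISTICS IO enabled before and after adding the INCLUDE column, and compare the "logical reads" figure in the Messages tab, which counts 8 KB pages touched.

```sql
SET STATISTICS IO ON;

-- Hypothetical query against the example table above.
SELECT LastName, Notes
FROM dbo.Customers
WHERE LastName = N'Smith';

SET STATISTICS IO OFF;
```

If logical reads drop because the key lookup disappears, the INCLUDE is paying its way; if scans now read many more pages, the large column is hurting.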