Drop Table
Support Forum for database administrators and web based access to important newsgroups related to databasesI am reading "SQL Server Query Performance Tuning Distilled", on page 104 it talks about one of the index design recommendations which is to choose the column that has very high selectivity of values instead of a column that has very few selectivity of values. My question is if I have currently indexes on my tables that have 1, 2, 3, 4, ... values only on thousands of rows, are these nonclustered indexes pretty much useless indexes that I should get rid of? And I know that pretty much the number of selectivity values will always remain very low. Thank you
Post Follow-up to this messageserge (sergea@nospam.ehmail.com) writes: > I am reading "SQL Server Query Performance Tuning Distilled", > on page 104 it talks about one of the index design recommendations > which is to choose the column that has very high selectivity of values > instead of a column that has very few selectivity of values. > > My question is if I have currently indexes on my tables that have > 1, 2, 3, 4, ... values only on thousands of rows, are these nonclustered > indexes pretty much useless indexes that I should get rid of? > > And I know that pretty much the number of selectivity values will > always remain very low. As always in the database world, it depends. An index on a bit column sound like a bad idea in general, but consider this query: SELECT ... FROM tbl WHERE unprocessed = convert(bit, 0) Typically in such a table, there will be only a small number of unprocessed rows, so the column is very selective for unprocessed = 0, and you almost need an index on unprocessed here. (And for the index to be useful, you need the convert as well, a subtlety with SQL Server data-type precedence.) It also matters here whether the index is clustered or not. To continue with the bit column, a non-clustered index on a bit column with a 50/50 split is useless (almost see below), where as a clustered index actually reduces the scan to only half of the table. Take this a little further and consider a column with ten different values with equal distribution. The non- clustered index is still not much of use, where as a clustered index reduces the reads for a query like: SELECT ... FROM tbl WHERE col = 'G' AND ... to 10% of a full scan. The reason the non-clustered index is useless, is because the optimizer will find it more expensive to seek the index and then look up rows from the data pages. But all this changes if all you read is columns from the index. Consider the bit column with a 50/50 split, and assume that you often need to run SELECT bitcol, COUNT(*) FROM tbl GROUP BY bitcol The non-clustered index is now a covering index and very useful. So bottom line is: good indexes are indexes that are used. -- Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se Books Online for SQL Server 2005 at http://www.microsoft.com/technet/pr...oads/books.mspx Books Online for SQL Server 2000 at http://www.microsoft.com/sql/prodin...ions/books.mspx
Post Follow-up to this message
Show a Printable Version
Email This Page to Someone!
Receive updates to this thread