Drop Table
Support Forum for database administrators and web based access to important newsgroups related to databasesI am currently having a debate with a developer. Initially all the tables in his database had only a primary key (PK), none of which were clustered, so all the tables were heaps. We have already seen some great improvements after I recommended additional indexes, however I am having a hard time convincing the developer that it is best for most tables in the database to have a clustered index - especially the tables with large numbers of records. His viewpoint is that this is not necesary because Clustered index are used when queries return a range of values ( or sorted, grouped, ... ), which he says is not the case for these tables. Who is right?
Post Follow-up to this messageI like to almost always have a clustered index on a table. At the very least, proper clustered indexes can help eliminate Sort Operations and Bookmark Lookups you'll often see in Query Execution plans when using non-clustered indexes. "DBA72" <DBA72@discussions.microsoft.com> wrote in message news:71BEC856-DEB6-46A0-B107- C57833D6101A@microso ft.com... >I am currently having a debate with a developer. Initially all the tables >in > his database had only a primary key (PK), none of which were clustered, so > all the tables were heaps. > > We have already seen some great improvements after I recommended > additional > indexes, however I am having a hard time convincing the developer that it > is > best for most tables in the database to have a clustered index - > especially > the tables with large numbers of records. > > His viewpoint is that this is not necesary because Clustered index are > used > when queries return a range of values ( or sorted, grouped, ... ), which > he > says is not the case for these tables. > > Who is right?
Post Follow-up to this messageSee if this helps: Cluster That Index! http://www.quest-pipelines.com/newsletter-v4/1103_B.htm AMB "DBA72" wrote: > I am currently having a debate with a developer. Initially all the tables in > his database had only a primary key (PK), none of which were clustered, so > all the tables were heaps. > > We have already seen some great improvements after I recommended additiona l > indexes, however I am having a hard time convincing the developer that it is > best for most tables in the database to have a clustered index - especiall y > the tables with large numbers of records. > > His viewpoint is that this is not necesary because Clustered index are use d > when queries return a range of values ( or sorted, grouped, ... ), which h e > says is not the case for these tables. > > Who is right?
Post Follow-up to this messageClustered indrex is good in the situation, as you stated, where the query runs agains a range of values of the clustered-index component columns. Also when you sort by the cluster key, it speeds up the process. Where it is not good is that you just put on a cluster on the key, for no particular reason. This way you waste the capability of clustering with a purpose. When you later on need a cluster, you will have to rebuild your index. Where it is bad is that you do not have any sequential query and you have a lot of non-sequential inserts. It will generate a lot of page splits (or a lot wasted space + unnecessary disk I/Os), whereas a heap will just append the new inserts at the bottom of the page/extent. hth "DBA72" <DBA72@discussions.microsoft.com> wrote in message news:71BEC856-DEB6-46A0-B107- C57833D6101A@microso ft.com... >I am currently having a debate with a developer. Initially all the tables >in > his database had only a primary key (PK), none of which were clustered, so > all the tables were heaps. > > We have already seen some great improvements after I recommended > additional > indexes, however I am having a hard time convincing the developer that it > is > best for most tables in the database to have a clustered index - > especially > the tables with large numbers of records. > > His viewpoint is that this is not necesary because Clustered index are > used > when queries return a range of values ( or sorted, grouped, ... ), which > he > says is not the case for these tables. > > Who is right?
Post Follow-up to this messageAsk him what the _dis_advantage is of having a clustered index, since he seems to be against them (I bet he can't come up with anything meaningful). Anything a non-clustered index can do, a clustered index can do at least as well, so if you only have one index on a table, make it clustered. On top of that, having a clustered index on a table will actually make your table perform better for updates, inserts and deletes. This is because the non-clustered index uses the clustered index to find the row, where when there is no clustered index the non-clustered indexes will use the actual physical row location. That means that if a row moves, for example because the page it is on gets split because it is too full, the non-clustered indexes that point to that row have to be updated when the table is a heap. But if the table has a clustered index, the key for the clustered index stays the same, so the non-clustered indexes don't have to be updated. Also, a row pointer in a non-clustered index on a heap takes up 10 bytes, a key for a clustered index can take up as little as 5 bytes. You can of course have only one clustered index on a table, that why you have to think about which index to make clustered. -- Jacco Schalkwijk SQL Server MVP "DBA72" <DBA72@discussions.microsoft.com> wrote in message news:71BEC856-DEB6-46A0-B107- C57833D6101A@microso ft.com... >I am currently having a debate with a developer. Initially all the tables >in > his database had only a primary key (PK), none of which were clustered, so > all the tables were heaps. > > We have already seen some great improvements after I recommended > additional > indexes, however I am having a hard time convincing the developer that it > is > best for most tables in the database to have a clustered index - > especially > the tables with large numbers of records. > > His viewpoint is that this is not necesary because Clustered index are > used > when queries return a range of values ( or sorted, grouped, ... ), which > he > says is not the case for these tables. > > Who is right?
Post Follow-up to this message"sql question" <abc@xyz.com> wrote in message news:O8s%23UYpRFHA.3628@TK2MSFTNGP12.phx.gbl... > Clustered indrex is good in the situation, as you stated, where the query > runs agains a range of values of the clustered-index component columns. > Also when you sort by the cluster key, it speeds up the process. It can also eliminate sort operations internal to the query execution plan (these don't necessarily guarantee the sorted order of returned results though). > Where it is not good is that you just put on a cluster on the key, for no > particular reason. This way you waste the capability of clustering with a > purpose. When you later on need a cluster, you will have to rebuild your > index. There's rarely a reason to not use a clustered index. You are right though, careful consideration should be made to ensure that you include appropriate columns in your clustered index so you won't need a "do-over" later. > Where it is bad is that you do not have any sequential query and you have > a lot of non-sequential inserts. It will generate a lot of page splits > (or a lot wasted space + unnecessary disk I/Os), whereas a heap will just > append the new inserts at the bottom of the page/extent. This is, of course, versus the additional cost of all those extra bookmark lookups (you thought page splits generated a lot of unecessary disk I/Os!) and internal sorts (lots of extra processing) due to not using a clustered index. You can limit the impact of page splitting by modifying fill factor and padding when you create the clustered index and by running DBCC REINDEX - especially after adding lots of rows to your table.
Post Follow-up to this message>> Where it is bad is that you do not have any sequential query and you have > > This is, of course, versus the additional cost of all those extra bookmark > lookups (you thought page splits generated a lot of unecessary disk I/Os!) > and internal sorts (lots of extra processing) due to not using a clustered > index. You can limit the impact of page splitting by modifying fill > factor and padding when you create the clustered index and by running DBCC > REINDEX - especially after adding lots of rows to your table. "you thought page splits generated a lot of unecessary disk I/Os!" No it's not that. Page split is one thing and (wasted space + unnecessary disk I/Os) is the other. Either you generate page split, or, if you have had the page splits, then you are looking at a low density population of the pages, which results in wasted space and unnecessary disk I/Os.
Post Follow-up to this message> index. You can limit the impact of page splitting by modifying fill > factor and padding when you create the clustered index and by running DBCC > REINDEX - especially after adding lots of rows to your table. And keep in mind, that the larger the fill factor is, the more likely you are going to have page splits and the smaller the fill factor is, the more empty space you are going to occupy. Either way you are not immune to the split or wasted space + added disk I/O problem. Also remember, the DBCC REINDEX is an offline process. You do not have your table available while running this process. Between your running this process, you remain subjectd to the wasted space + added disk I/O problem.
Post Follow-up to this messageOn Thu, 21 Apr 2005 18:20:17 +0100, Jacco Schalkwijk wrote: (snip) >That means that if a row moves, for example because >the page it is on gets split because it is too full, the non-clustered >indexes that point to that row have to be updated when the table is a heap. Hi Jacco, I don't think the above is correct. First, there are no page splits in a heap. Individual rows can move (when varying length columns are updated with longer values), but in that case, a forwarding pointer is inserted at the original location, so that nonclustered indexes don't need to be changed. Of course, the downside is an extra page read: after finding the page, a forwarding pointer instead of a row is encountered and the page for the new location needs to be read. Best, Hugo -- (Remove _NO_ and _SPAM_ to get my e-mail address)
Post Follow-up to this messageOn Thu, 21 Apr 2005 09:47:12 -0700, DBA72 wrote: >I am currently having a debate with a developer. Initially all the tables i n >his database had only a primary key (PK), none of which were clustered, so >all the tables were heaps. > >We have already seen some great improvements after I recommended additional >indexes, however I am having a hard time convincing the developer that it i s >best for most tables in the database to have a clustered index - especially >the tables with large numbers of records. > >His viewpoint is that this is not necesary because Clustered index are used >when queries return a range of values ( or sorted, grouped, ... ), which he >says is not the case for these tables. > >Who is right? Hi DBA72, Best way to find out is to test it. Load the current tables on a test server, fill them with enough data for meaningful speed comparisons and check how long some typical actions take. Change one of the indexes for each table to clustered, then repeat the same tests. Don't forget to clear out the data cache before each test. Clear out the procedure cache as well, to force recompiles that will take the new indexes as well (maybe not necessary, but it won't hurt). CHECKPOINT DBCC DROPCLEANBUFFERS DBCC FREEPROCCACHE Best, Hugo -- (Remove _NO_ and _SPAM_ to get my e-mail address)
Post Follow-up to this message
Show a Printable Version
Email This Page to Someone!
Receive updates to this thread