Drop Table
Support Forum for database administrators and web based access to important newsgroups related to databasesI've got a table which is in 3 separate databases on the same server, each with different indexes. The rest of the database is the same (2 were restored from a backup of the first - i could have done 3 tables in the same database but it just seemed easier at the time). All indexes on these 3 tables have been dropped and recreated. The tables are like this fromInt int not null, lastInt int not null, id char(32) not null, name varchar(255) null, keyname varchar(255) null, description varchar(255) null, groupid char(32) NO PRIMARY KEY defined. In db1 I have a non-clustered unique index on lastInt, id (lastInt is pretty specific, the most for a specific lastchanged is 147 - out of a total of 160,000 records - but over 120,000 have less than 30). In db2 I have a CLUSTERED unique index on lastInt,Id AND a non-clustered index on Id,lastInt In db3 I have a non-clustered unique index on lastInt,Id AND a non-clustered index on Id,lastInt Common sense would tell me that db2, using a clustered index would generally perform better for reads. db3 should perform better than db1 for reads because the optimiser has 2 choices of index? I am using maxdop(1), DBCC DROPCLEANBUFFERS, DBCC FREEPROCCACHE and CHECKPOINT before any query to try to ensure a clean query. 1) SELECT * FROM mytable t1 with (nolock) WHERE lastInt = ( select max(lastInt) from myTable t2 with (nolock) where t2.id = t1.id AND lastInt > 22881 AND lastInt <= 22900) option (maxdop 1) returns just 137 rows out of 160,000. All dbs perform quickly though execution plan for db3 shows greatest cost by far: 6%, 5%, 89%!! Why? db1: |--Bookmark Lookup(BOOKMARK:([Bmk1000]), OBJECT:(myTable AS [t1]) WITH PREFETCH) |--Nested Loops(Inner Join, OUTER REFERENCES:([t2].[Id], [Expr1002]) WITH PREFETCH) |--Stream Aggregate(GROUP BY:([t2].[Id]) DEFINE:([Expr1002]=MAX([t2].[lastInt]))) | |--Sort(ORDER BY:([t2].[Id] ASC)) | |--Index Seek(OBJECT:(myTable .[myIndex] AS [t2]), SEEK:([t2].[lastInt] > 22881 AND [t2].[lastInt] <= 22900) OR DERED FORWARD) |--Index Seek(OBJECT:(myTable .[myIndex] AS [t1]), SEEK:([t1].[lastInt]=[Expr1002] AND [t1].[Id]=[t2].& #91;Id]) ORDERED FORWARD) as opposed to: db3: |--Top(1) |--Segment |--Sort(ORDER BY:([t1].[Id] DESC, [t1].[lastInt] DESC)) |--Bookmark Lookup(BOOKMARK:([Bmk1000]), OBJECT:(myTable AS [t1]) WITH PREFETCH) |--Index Seek(OBJECT:([myTable].[myIndex] AS [t1]), SEEK:([t1].[lastInt] > 22881 AND [t1].[lastInt] < = 22900) ORDERED FORWARD) But the performance for this query is so quick it is more just a matter of interest. However then I try the same query but with 15000 as the lower figure instead of 22881 (returns 14722 rows) db1: |--Hash Match(Inner Join, HASH:([Expr1002], [t2].[Id])=([t1].[lastInt], [t1].[Id]), RESIDUAL:([Expr1002]=[t1].[lastInt] AND [t1].[Id]=[t 2].[Id])) |--Hash Match(Aggregate, HASH:([t2].[Id]), RESIDUAL:([t2].[Id]=[t2].[Id]) DEFINE:([Expr1002]=MAX( 91;t2].[lastInt]))) | |--Index Seek(OBJECT:([myTable].[myIndex] AS [t2]), SEEK:([t2].[lastInt] > 15000 AND [t2].[lastInt] <= 22900) OR DERED FORWARD) |--Table Scan(OBJECT:([myTable] AS [t1])) db2: |--Top(1) |--Segment |--Sort(ORDER BY:([t1].[Id] DESC, [t1].[lastInt] DESC)) |--Clustered Index Seek(OBJECT:([myTable].[myIndex] AS [t1]), SEEK:([t1].[lastInt] > 15000 AND [t1].[lastInt] < = 22900) ORDERED FORWARD) db3: |--Hash Match(Inner Join, HASH:([Expr1002], [t2].[Id])=([t1].[lastInt], [t1].[Id]), RESIDUAL:([Expr1002]=[t1].[lastInt] AND [t1].[Id]=[t 2].[Id])) |--Stream Aggregate(GROUP BY:([t2].[Id]) DEFINE:([Expr1002]=MAX([t2].[lastInt]))) | |--Index Scan(OBJECT:([myTable].[myIndex2] AS [t2]), WHERE:([t2].[lastInt]>15000 AND [t2].[lastInt]<=22900) ORDER ED FORWARD) |--Table Scan(OBJECT:([myTable] AS [t1])) This shows 41% for db1, 20% for db2 and 39% for db3 and yet in reality the time taken is 16 secs, 20 secs and 5 secs. Consistently. So, although there are other factors in play, I have done the test a number of times and the results end up pretty much the same (in fact before I dropped and recreated the indexes db2 with the clustered index was taking 40 seconds but the others were the same - eh?) I don't understand this. First of all, why, in the first query, does db3 have such a high relative query cost. And in the second query, why is db3 by far the best performer? Finally I try this query: SELECT * FROM myTable t1 with (nolock) WHERE lastInt = ( select max(lastInt) from myTable t2 with (nolock) where t2.id = t1.id AND lastInt <= 22900) and fromInt = -2147483646 option (maxdop 1) Again the results come back in 18, 9 and 5 seconds respectively with query costs: 38%, 31% and 31% (in this case db1 timing seems quite variable whereas db2 and db3 are pretty consistent no matter what order I do them - in fact before I dropped and recreated the indexes db2 with the clustered index was taking 18 seconds but the others were the same : see above, this is coincidentally (probably not) double the time taken after the index was dropped and recreated, same as query above - eh?). I'm struggling to understand the inner workings here. I assumed that because the query cost was generally the lowest, and a clustered index is usually a good thing, that the results would favour the clustered index but consistently they don't. My estimated row width is 474 (estimated clustered index width = 500). Interestingly myIndex in db1 has the same estimated size as myIndex2 in db3 (i.e 51) yet myIndex2 in db2 has estimated size 77 - why??? myIndex2 in db2 is the same as myIndex2 in db3 on exactly the same data. I have also created statistics on (lastInt, id) and updated existing statistics on (id) and (fromInt) separately in all 3 dbs. I know joins might be better than sub-selects in some cases, but a) I want to understand the basic flaws in my thinking about SQL server and b) because the field list is built dynamically within the real-life app it would be more tricky to prefix all fields with t1. to ensure I got the expected results. Sorry for long post but I thought it was better to have all facts. Thanks for any help Phil
Post Follow-up to this messagedeja > In db2 I have a CLUSTERED unique index on lastInt,Id AND a > non-clustered index on Id,lastInt > In db3 I have a non-clustered unique index on lastInt,Id AND a > non-clustered index on Id,lastInt Having a duplicate indexes is a bad idea > SELECT * FROM mytable t1 with (nolock) > WHERE lastInt = ( select max(lastInt) from myTable t2 with (nolock) > where t2.id = t1.id AND lastInt > 22881 AND lastInt <= 22900) > option (maxdop 1) Don't use SELECT * in the production enviroment , it leads sql server to perform bookmark to retrieve the data which my hurt perfomance If you use a range seraching I'd suggest you creating a clustered index on lastInt column <deja@2bytes.co.uk> wrote in message news:1166091799.821002.27650@16g2000cwy.googlegroups.com... > I've got a table which is in 3 separate databases on the same server, > each with different indexes. The rest of the database is the same (2 > were restored from a backup of the first - i could have done 3 tables > in the same database but it just seemed easier at the time). All > indexes on these 3 tables have been dropped and recreated. > > The tables are like this > fromInt int not null, > lastInt int not null, > id char(32) not null, > name varchar(255) null, > keyname varchar(255) null, > description varchar(255) null, > groupid char(32) > > NO PRIMARY KEY defined. > > In db1 I have a non-clustered unique index on lastInt, id (lastInt is > pretty specific, the most for a specific lastchanged is 147 - out of a > total of 160,000 records - but over 120,000 have less than 30). > > In db2 I have a CLUSTERED unique index on lastInt,Id AND a > non-clustered index on Id,lastInt > > In db3 I have a non-clustered unique index on lastInt,Id AND a > non-clustered index on Id,lastInt > > Common sense would tell me that db2, using a clustered index would > generally perform better for reads. db3 should perform better than db1 > for reads because the optimiser has 2 choices of index? > > I am using maxdop(1), DBCC DROPCLEANBUFFERS, DBCC FREEPROCCACHE and > CHECKPOINT before any query to try to ensure a clean query. > > 1) > SELECT * FROM mytable t1 with (nolock) > WHERE lastInt = ( select max(lastInt) from myTable t2 with (nolock) > where t2.id = t1.id AND lastInt > 22881 AND lastInt <= 22900) > option (maxdop 1) > > returns just 137 rows out of 160,000. All dbs perform quickly though > execution plan for db3 shows greatest cost by far: 6%, 5%, 89%!! Why? > > db1: > |--Bookmark Lookup(BOOKMARK:([Bmk1000]), OBJECT:(myTable AS [t1]) > WITH PREFETCH) > |--Nested Loops(Inner Join, OUTER REFERENCES:([t2].[Id], > [Expr1002]) WITH PREFETCH) > |--Stream Aggregate(GROUP BY:([t2].[Id]) > DEFINE:([Expr1002]=MAX([t2].[lastInt]))) > | |--Sort(ORDER BY:([t2].[Id] ASC)) > | |--Index Seek(OBJECT:(myTable .[myIndex] AS [t 2]), > SEEK:([t2].[lastInt] > 22881 AND [t2].[lastInt] <= 22900) ORDERED > FORWARD) > |--Index Seek(OBJECT:(myTable .[myIndex] AS [t1]), > SEEK:([t1].[lastInt]=[Expr1002] AND [t1].[Id]=[t2] .[Id]) ORDERED > FORWARD) > > as opposed to: > > db3: > |--Top(1) > |--Segment > |--Sort(ORDER BY:([t1].[Id] DESC, [t1].[lastInt ] DESC)) > |--Bookmark Lookup(BOOKMARK:([Bmk1000]), > OBJECT:(myTable AS [t1]) WITH PREFETCH) > |--Index Seek(OBJECT:([myTable].[myIndex] AS > [t1]), SEEK:([t1].[lastInt] > 22881 AND [t1].[lastInt] <= 22900) > ORDERED FORWARD) > > But the performance for this query is so quick it is more just a matter > of interest. However then I try the same query but with 15000 as the > lower figure instead of 22881 (returns 14722 rows) > > db1: > |--Hash Match(Inner Join, HASH:([Expr1002], > [t2].[Id])=([t1].[lastInt], [t1].[Id]), > RESIDUAL:([Expr1002]=[t1].[lastInt] AND [t1].[Id]=[ ;t2].[Id])) > |--Hash Match(Aggregate, HASH:([t2].[Id]), > RESIDUAL:([t2].[Id]=[t2].[Id]) DEFINE:([Expr1002]=MAX( [t2].[lastInt]))) > | |--Index Seek(OBJECT:([myTable].[myIndex] AS [t2]), > SEEK:([t2].[lastInt] > 15000 AND [t2].[lastInt] <= 22900) ORDERED > FORWARD) > |--Table Scan(OBJECT:([myTable] AS [t1])) > > db2: > |--Top(1) > |--Segment > |--Sort(ORDER BY:([t1].[Id] DESC, [t1].[lastInt ] DESC)) > |--Clustered Index Seek(OBJECT:([myTable].[myIndex ] AS > [t1]), SEEK:([t1].[lastInt] > 15000 AND [t1].[lastInt] <= 22900) > ORDERED FORWARD) > > db3: > |--Hash Match(Inner Join, HASH:([Expr1002], > [t2].[Id])=([t1].[lastInt], [t1].[Id]), > RESIDUAL:([Expr1002]=[t1].[lastInt] AND [t1].[Id]=[ ;t2].[Id])) > |--Stream Aggregate(GROUP BY:([t2].[Id]) > DEFINE:([Expr1002]=MAX([t2].[lastInt]))) > | |--Index Scan(OBJECT:([myTable].[myIndex2] AS [t2]) , > WHERE:([t2].[lastInt]>15000 AND [t2].[lastInt]<=22900) ORD ERED FORWARD) > |--Table Scan(OBJECT:([myTable] AS [t1])) > > This shows 41% for db1, 20% for db2 and 39% for db3 and yet in reality > the time taken is 16 secs, 20 secs and 5 secs. Consistently. So, > although there are other factors in play, I have done the test a number > of times and the results end up pretty much the same (in fact before I > dropped and recreated the indexes db2 with the clustered index was > taking 40 seconds but the others were the same - eh?) > > I don't understand this. First of all, why, in the first query, does > db3 have such a high relative query cost. And in the second query, why > is db3 by far the best performer? > > > Finally I try this query: > > SELECT * FROM myTable t1 with (nolock) > WHERE lastInt = ( select max(lastInt) from myTable t2 with (nolock) > where t2.id = t1.id AND lastInt <= 22900) and fromInt = -2147483646 > option (maxdop 1) > > Again the results come back in 18, 9 and 5 seconds respectively with > query costs: 38%, 31% and 31% (in this case db1 timing seems quite > variable whereas db2 and db3 are pretty consistent no matter what order > I do them - in fact before I dropped and recreated the indexes db2 with > the clustered index was taking 18 seconds but the others were the same > : see above, this is coincidentally (probably not) double the time > taken after the index was dropped and recreated, same as query above - > eh?). > > I'm struggling to understand the inner workings here. I assumed that > because the query cost was generally the lowest, and a clustered index > is usually a good thing, that the results would favour the clustered > index but consistently they don't. My estimated row width is 474 > (estimated clustered index width = 500). Interestingly myIndex in db1 > has the same estimated size as myIndex2 in db3 (i.e 51) yet myIndex2 in > db2 has estimated size 77 - why??? myIndex2 in db2 is the same as > myIndex2 in db3 on exactly the same data. > > I have also created statistics on (lastInt, id) and updated existing > statistics on (id) and (fromInt) separately in all 3 dbs. > > I know joins might be better than sub-selects in some cases, but a) I > want to understand the basic flaws in my thinking about SQL server and > b) because the field list is built dynamically within the real-life app > it would be more tricky to prefix all fields with t1. to ensure I got > the expected results. > > Sorry for long post but I thought it was better to have all facts. > Thanks for any help > > Phil >
Post Follow-up to this message> Having a duplicate indexes is a bad idea well you say that, but in fact if you look at the figures it performs much quicker..... > Don't use SELECT * in the production enviroment , it leads sql server to > perform bookmark to retrieve the data which my hurt perfomance > If you use a range seraching I'd suggest you creating a clustered index on > lastInt column I have got a clustered index on lastInt column in db2 and it didn't perform very well. Did you read the stats? I also changed my *, to explicitly select all columns by name - made no difference at all. BTW I am on SQL 2000 Service Pack 4. Uri Dimant wrote: > deja > > Having a duplicate indexes is a bad idea > > > > Don't use SELECT * in the production enviroment , it leads sql server to > perform bookmark to retrieve the data which my hurt perfomance > If you use a range seraching I'd suggest you creating a clustered index on > lastInt column > > > > <deja@2bytes.co.uk> wrote in message > news:1166091799.821002.27650@16g2000cwy.googlegroups.com...
Post Follow-up to this message
Show a Printable Version
Email This Page to Someone!
Receive updates to this thread