Drop Table

Support Forum for database administrators and web based access to important newsgroups related to databases
Register on Database Support Forum Edit your profileCalendarFind other Database Support forum membersFrequently Asked QuestionsSearch this forum -> 
For Database admins: Free Database-related Magazines Now Free shipping to Texas


Post New Thread










Thread
Author

who understands SQL Server? --- agghhhhhhh......
I've got a table which is in 3 separate databases on the same server,
each with different indexes. The rest of the database is the same (2
were restored from a backup of the first - i could have done 3 tables
in the same database but it just seemed easier at the time). All
indexes on these 3 tables have been dropped and recreated.

The tables are like this
fromInt   int not null,
lastInt    int not null,
id          char(32) not null,
name    varchar(255) null,
keyname  varchar(255) null,
description varchar(255) null,
groupid     char(32)

NO PRIMARY KEY defined.

In db1 I have a non-clustered unique index on lastInt, id  (lastInt is
pretty specific, the most for a specific lastchanged is 147 - out of a
total of 160,000 records - but over 120,000 have less than 30).

In db2 I have a CLUSTERED unique index on lastInt,Id AND a
non-clustered index on Id,lastInt

In db3 I have a non-clustered unique index on lastInt,Id AND a
non-clustered index on Id,lastInt

Common sense would tell me that db2, using a clustered index would
generally perform better for reads. db3 should perform better than db1
for reads because the optimiser has 2 choices of index?

I am using maxdop(1), DBCC DROPCLEANBUFFERS, DBCC FREEPROCCACHE and
CHECKPOINT before any query to try to ensure a clean query.

1)
SELECT * FROM mytable t1 with (nolock)
WHERE lastInt = ( select max(lastInt) from myTable t2 with (nolock)
where t2.id = t1.id AND lastInt > 22881 AND lastInt <= 22900)
option (maxdop 1)

returns just 137 rows out of 160,000. All dbs perform quickly though
execution plan for db3 shows greatest cost by far: 6%, 5%, 89%!! Why?

db1:
|--Bookmark Lookup(BOOKMARK:([Bmk1000]), OBJECT:(myTable AS [t1])
WITH PREFETCH)
|--Nested Loops(Inner Join, OUTER REFERENCES:([t2].[Id],
[Expr1002]) WITH PREFETCH)
|--Stream Aggregate(GROUP BY:([t2].[Id])
DEFINE:([Expr1002]=MAX([t2].[lastInt])))
|    |--Sort(ORDER BY:([t2].[Id] ASC))
|         |--Index  Seek(OBJECT:(myTable
.[myIndex] AS [t2]),
SEEK:([t2].[lastInt] > 22881 AND [t2].[lastInt] <= 22900) OR
DERED
FORWARD)
|--Index  Seek(OBJECT:(myTable
.[myIndex] AS [t1]),
SEEK:([t1].[lastInt]=[Expr1002] AND [t1].[Id]=[t2].&
#91;Id]) ORDERED
FORWARD)

as opposed to:

db3:
|--Top(1)
|--Segment
|--Sort(ORDER BY:([t1].[Id] DESC, [t1].[lastInt] DESC))
|--Bookmark Lookup(BOOKMARK:([Bmk1000]),
OBJECT:(myTable AS [t1]) WITH PREFETCH)
|--Index Seek(OBJECT:([myTable].[myIndex] AS
[t1]), SEEK:([t1].[lastInt] > 22881 AND [t1].[lastInt] <
= 22900)
ORDERED FORWARD)

But the performance for this query is so quick it is more just a matter
of interest. However then I try the same query but with 15000 as the
lower figure instead of 22881 (returns 14722 rows)

db1:
|--Hash Match(Inner Join, HASH:([Expr1002],
[t2].[Id])=([t1].[lastInt], [t1].[Id]),
RESIDUAL:([Expr1002]=[t1].[lastInt] AND [t1].[Id]=[t
2].[Id]))
|--Hash Match(Aggregate, HASH:([t2].[Id]),
RESIDUAL:([t2].[Id]=[t2].[Id]) DEFINE:([Expr1002]=MAX(&#
91;t2].[lastInt])))
|    |--Index Seek(OBJECT:([myTable].[myIndex] AS [t2]),
SEEK:([t2].[lastInt] > 15000 AND [t2].[lastInt] <= 22900) OR
DERED
FORWARD)
|--Table Scan(OBJECT:([myTable] AS [t1]))

db2:
|--Top(1)
|--Segment
|--Sort(ORDER BY:([t1].[Id] DESC, [t1].[lastInt] DESC))
|--Clustered Index Seek(OBJECT:([myTable].[myIndex] AS
[t1]), SEEK:([t1].[lastInt] > 15000 AND [t1].[lastInt] <
= 22900)
ORDERED FORWARD)

db3:
|--Hash Match(Inner Join, HASH:([Expr1002],
[t2].[Id])=([t1].[lastInt], [t1].[Id]),
RESIDUAL:([Expr1002]=[t1].[lastInt] AND [t1].[Id]=[t
2].[Id]))
|--Stream Aggregate(GROUP BY:([t2].[Id])
DEFINE:([Expr1002]=MAX([t2].[lastInt])))
|    |--Index Scan(OBJECT:([myTable].[myIndex2] AS [t2]),
WHERE:([t2].[lastInt]>15000 AND [t2].[lastInt]<=22900) ORDER
ED FORWARD)
|--Table Scan(OBJECT:([myTable] AS [t1]))

This shows 41% for db1, 20% for db2 and 39% for db3 and yet in reality
the time taken is 16 secs, 20 secs and 5 secs. Consistently. So,
although there are other factors in play, I have done the test a number
of times and the results end up pretty much the same (in fact before I
dropped and recreated the indexes db2 with the clustered index was
taking 40 seconds but the others were the same - eh?)

I don't understand this. First of all, why, in the first query, does
db3 have such a high relative query cost. And in the second query, why
is db3 by far the best performer?


Finally I try this query:

SELECT * FROM myTable t1 with (nolock)
WHERE lastInt = ( select max(lastInt) from myTable t2 with (nolock)
where t2.id = t1.id AND lastInt <= 22900) and fromInt = -2147483646
option (maxdop 1)

Again the results come back in 18, 9 and 5 seconds respectively with
query costs: 38%, 31% and 31% (in this case db1 timing seems quite
variable whereas db2 and db3 are pretty consistent no matter what order
I do them - in fact before I dropped and recreated the indexes db2 with
the clustered index was taking 18 seconds but the others were the same
: see above, this is coincidentally (probably not) double the time
taken after the index was dropped and recreated, same as query above -
eh?).

I'm struggling to understand the inner workings here. I assumed that
because the query cost was generally the lowest, and a clustered index
is usually a good thing, that the results would favour the clustered
index but consistently they don't. My estimated row width is 474
(estimated clustered index width = 500). Interestingly myIndex in db1
has the same estimated size as myIndex2 in db3 (i.e 51) yet myIndex2 in
db2 has estimated size 77 - why??? myIndex2 in db2 is the same as
myIndex2 in db3 on exactly the same data.

I have also created statistics on (lastInt, id) and updated existing
statistics on (id) and (fromInt) separately in all 3 dbs.

I know joins might be better than sub-selects in some cases, but a) I
want to understand the basic flaws in my thinking about SQL server and
b) because the field list is built dynamically within the real-life app
it would be more tricky to prefix all fields with t1. to ensure I got
the expected results.

Sorry for long post but I thought it was better to have all facts.
Thanks for any help

Phil


Report this thread to moderator Post Follow-up to this message
Old Post
deja@2bytes.co.uk
12-14-06 10:25 AM


Re: who understands SQL Server? --- agghhhhhhh......
deja
> In db2 I have a CLUSTERED unique index on lastInt,Id AND a
> non-clustered index on Id,lastInt
> In db3 I have a non-clustered unique index on lastInt,Id AND a
> non-clustered index on Id,lastInt

Having a duplicate indexes  is a bad idea


> SELECT * FROM mytable t1 with (nolock)
> WHERE lastInt = ( select max(lastInt) from myTable t2 with (nolock)
> where t2.id = t1.id AND lastInt > 22881 AND lastInt <= 22900)
> option (maxdop 1)

Don't use SELECT * in the production enviroment , it leads sql server to
perform bookmark to retrieve the data which my hurt perfomance
If you use a range seraching I'd suggest you creating a clustered index on
lastInt  column



<deja@2bytes.co.uk> wrote in message
news:1166091799.821002.27650@16g2000cwy.googlegroups.com...
> I've got a table which is in 3 separate databases on the same server,
> each with different indexes. The rest of the database is the same (2
> were restored from a backup of the first - i could have done 3 tables
> in the same database but it just seemed easier at the time). All
> indexes on these 3 tables have been dropped and recreated.
>
> The tables are like this
> fromInt   int not null,
> lastInt    int not null,
> id          char(32) not null,
> name    varchar(255) null,
> keyname  varchar(255) null,
> description varchar(255) null,
> groupid     char(32)
>
> NO PRIMARY KEY defined.
>
> In db1 I have a non-clustered unique index on lastInt, id  (lastInt is
> pretty specific, the most for a specific lastchanged is 147 - out of a
> total of 160,000 records - but over 120,000 have less than 30).
>
> In db2 I have a CLUSTERED unique index on lastInt,Id AND a
> non-clustered index on Id,lastInt
>
> In db3 I have a non-clustered unique index on lastInt,Id AND a
> non-clustered index on Id,lastInt
>
> Common sense would tell me that db2, using a clustered index would
> generally perform better for reads. db3 should perform better than db1
> for reads because the optimiser has 2 choices of index?
>
> I am using maxdop(1), DBCC DROPCLEANBUFFERS, DBCC FREEPROCCACHE and
> CHECKPOINT before any query to try to ensure a clean query.
>
> 1)
> SELECT * FROM mytable t1 with (nolock)
> WHERE lastInt = ( select max(lastInt) from myTable t2 with (nolock)
> where t2.id = t1.id AND lastInt > 22881 AND lastInt <= 22900)
> option (maxdop 1)
>
> returns just 137 rows out of 160,000. All dbs perform quickly though
> execution plan for db3 shows greatest cost by far: 6%, 5%, 89%!! Why?
>
> db1:
>  |--Bookmark Lookup(BOOKMARK:([Bmk1000]), OBJECT:(myTable AS [t1])
> WITH PREFETCH)
>       |--Nested Loops(Inner Join, OUTER REFERENCES:([t2].[Id],
> [Expr1002]) WITH PREFETCH)
>            |--Stream Aggregate(GROUP BY:([t2].[Id])
> DEFINE:([Expr1002]=MAX([t2].[lastInt])))
>            |    |--Sort(ORDER BY:([t2].[Id] ASC))
>            |         |--Index  Seek(OBJECT:(myTable
.[myIndex] AS [t
2]),
> SEEK:([t2].[lastInt] > 22881 AND [t2].[lastInt] <= 22900) 
ORDERED
> FORWARD)
>            |--Index  Seek(OBJECT:(myTable
.[myIndex] AS [t1]),
> SEEK:([t1].[lastInt]=[Expr1002] AND [t1].[Id]=[t2]
.[Id]) ORDERED
> FORWARD)
>
> as opposed to:
>
> db3:
>  |--Top(1)
>       |--Segment
>            |--Sort(ORDER BY:([t1].[Id] DESC, [t1].[lastInt
] DESC))
>                 |--Bookmark Lookup(BOOKMARK:([Bmk1000]),
> OBJECT:(myTable AS [t1]) WITH PREFETCH)
>                      |--Index Seek(OBJECT:([myTable].[myIndex] AS
> [t1]), SEEK:([t1].[lastInt] > 22881 AND [t1].[lastInt]
 <= 22900)
> ORDERED FORWARD)
>
> But the performance for this query is so quick it is more just a matter
> of interest. However then I try the same query but with 15000 as the
> lower figure instead of 22881 (returns 14722 rows)
>
> db1:
>  |--Hash Match(Inner Join, HASH:([Expr1002],
> [t2].[Id])=([t1].[lastInt], [t1].[Id]),
> RESIDUAL:([Expr1002]=[t1].[lastInt] AND [t1].[Id]=[
;t2].[Id]))
>       |--Hash Match(Aggregate, HASH:([t2].[Id]),
> RESIDUAL:([t2].[Id]=[t2].[Id]) DEFINE:([Expr1002]=MAX(
[t2].[lastInt])))
>       |    |--Index Seek(OBJECT:([myTable].[myIndex] AS [t2]),
> SEEK:([t2].[lastInt] > 15000 AND [t2].[lastInt] <= 22900) 
ORDERED
> FORWARD)
>       |--Table Scan(OBJECT:([myTable] AS [t1]))
>
> db2:
>  |--Top(1)
>       |--Segment
>            |--Sort(ORDER BY:([t1].[Id] DESC, [t1].[lastInt
] DESC))
>                 |--Clustered Index Seek(OBJECT:([myTable].[myIndex
] AS
> [t1]), SEEK:([t1].[lastInt] > 15000 AND [t1].[lastInt]
 <= 22900)
> ORDERED FORWARD)
>
> db3:
>  |--Hash Match(Inner Join, HASH:([Expr1002],
> [t2].[Id])=([t1].[lastInt], [t1].[Id]),
> RESIDUAL:([Expr1002]=[t1].[lastInt] AND [t1].[Id]=[
;t2].[Id]))
>       |--Stream Aggregate(GROUP BY:([t2].[Id])
> DEFINE:([Expr1002]=MAX([t2].[lastInt])))
>       |    |--Index Scan(OBJECT:([myTable].[myIndex2] AS [t2])
,
> WHERE:([t2].[lastInt]>15000 AND [t2].[lastInt]<=22900) ORD
ERED FORWARD)
>       |--Table Scan(OBJECT:([myTable] AS [t1]))
>
> This shows 41% for db1, 20% for db2 and 39% for db3 and yet in reality
> the time taken is 16 secs, 20 secs and 5 secs. Consistently. So,
> although there are other factors in play, I have done the test a number
> of times and the results end up pretty much the same (in fact before I
> dropped and recreated the indexes db2 with the clustered index was
> taking 40 seconds but the others were the same - eh?)
>
> I don't understand this. First of all, why, in the first query, does
> db3 have such a high relative query cost. And in the second query, why
> is db3 by far the best performer?
>
>
> Finally I try this query:
>
> SELECT * FROM myTable t1 with (nolock)
> WHERE lastInt = ( select max(lastInt) from myTable t2 with (nolock)
> where t2.id = t1.id AND lastInt <= 22900) and fromInt = -2147483646
> option (maxdop 1)
>
> Again the results come back in 18, 9 and 5 seconds respectively with
> query costs: 38%, 31% and 31% (in this case db1 timing seems quite
> variable whereas db2 and db3 are pretty consistent no matter what order
> I do them - in fact before I dropped and recreated the indexes db2 with
> the clustered index was taking 18 seconds but the others were the same
> : see above, this is coincidentally (probably not) double the time
> taken after the index was dropped and recreated, same as query above -
> eh?).
>
> I'm struggling to understand the inner workings here. I assumed that
> because the query cost was generally the lowest, and a clustered index
> is usually a good thing, that the results would favour the clustered
> index but consistently they don't. My estimated row width is 474
> (estimated clustered index width = 500). Interestingly myIndex in db1
> has the same estimated size as myIndex2 in db3 (i.e 51) yet myIndex2 in
> db2 has estimated size 77 - why??? myIndex2 in db2 is the same as
> myIndex2 in db3 on exactly the same data.
>
> I have also created statistics on (lastInt, id) and updated existing
> statistics on (id) and (fromInt) separately in all 3 dbs.
>
> I know joins might be better than sub-selects in some cases, but a) I
> want to understand the basic flaws in my thinking about SQL server and
> b) because the field list is built dynamically within the real-life app
> it would be more tricky to prefix all fields with t1. to ensure I got
> the expected results.
>
> Sorry for long post but I thought it was better to have all facts.
> Thanks for any help
>
> Phil
>



Report this thread to moderator Post Follow-up to this message
Old Post
Uri Dimant
12-14-06 10:25 AM


Re: who understands SQL Server? --- agghhhhhhh......
> Having a duplicate indexes  is a bad idea

well you say that, but in fact if you look at the figures it performs
much quicker.....


> Don't use SELECT * in the production enviroment , it leads sql server to
> perform bookmark to retrieve the data which my hurt perfomance
> If you use a range seraching I'd suggest you creating a clustered index on
> lastInt  column

I have got a clustered index on lastInt column in db2 and it didn't
perform very well. Did you read the stats?
I also changed my *, to explicitly select all columns by name - made no
difference at all. BTW I am on SQL 2000 Service Pack 4.



Uri Dimant wrote:

> deja 
>
> Having a duplicate indexes  is a bad idea
>
> 
>
> Don't use SELECT * in the production enviroment , it leads sql server to
> perform bookmark to retrieve the data which my hurt perfomance
> If you use a range seraching I'd suggest you creating a clustered index on
> lastInt  column
>
>
>
> <deja@2bytes.co.uk> wrote in message
> news:1166091799.821002.27650@16g2000cwy.googlegroups.com... 


Report this thread to moderator Post Follow-up to this message
Old Post
deja@2bytes.co.uk
12-14-06 10:25 AM


Sponsored Links





Last Thread Next Thread
Post New Thread

MS SQL Server archive

Show a Printable Version Email This Page to Someone! Receive updates to this thread
Microsoft SQL Server
Access database support
PostgreSQL Replication
SQL Server ODBC
FoxPro Support
PostgreSQL pgAdmin
SQL Server Clustering
MySQL ODBC
Web Applications with dBASE
SQL Server CE
MySQL++
Sybase Database Support
MS SQL Full Text Search
PostgreSQL Administration
SQL Anywhere support
DB2 UDB Database
Paradox Database Support
Filemaker Database
Berkley DB
SQL 2000/2000i database
ASE Database
Forum Jump:
All times are GMT. The time now is 10:07 PM.

 
Mobile devices forum | Database support forum archive




Copyrights DropTable.com Database Support Forum 2004 - 2006