Drop Table
Support Forum for database administrators and web based access to important newsgroups related to databasesThis is a question concerning query optimisation. Sorry if it's a bit long, but thanks to anyone who has the patience to help - This is my first post here... If I have two tables: 'tblContact' and 'tblCategory' where categories are like: Code Name 010101 Short 010102 Fat 010103 Stupid The junction table 'tblConCat' has fields CctConID, CctCatCode to tell me which contacts have which category codes. These are nchar(6) fields, if it makes any difference. If I need to find all people who are short, fat and stupid I can see two ways: Solution One: SELECT tblContact.* FROM tblContact WHERE ConID IN (SELECT CctConID FROM tblConCat WHERE CctCatCode='010101') AND ConID IN (SELECT CctConID FROM tblConCat WHERE CctCatCode='010102') AND ConID IN (SELECT CctConID FROM tblConCat WHERE CctCatCode='010103') Solution Two: Build a helper table which contains the codes I'm looking for SELECT tblContact.* FROM tblContact WHERE ConID IN (SELECT CctConID FROM tblConCat INNER JOIN tblHelper ON tblConCat.CctCatCode = tblHelper.HlpCatCode GROUP BY CctConID HAVING Count(*)=3) I have tried them both out and although I looked at the query analyzer it provided more information than I knew what to do with. In practise they both provide similar working times (I wait about a second) but I thought the first looked rather inefficient and thought my helper table might help. There are about 30,000 contact records, 180 category records and 120,000 junction table records. All I am looking for comments on any pros and cons of these two approaches - does one look that bad? The database was migrated from Access where the helper table really did help, but using SQL Server I might not need it. Thanks again, if you got this far!
Post Follow-up to this message> I have tried them both out and although I looked at the query analyzer it > provided more information than I knew what to do with. In practise they > both provide similar working times You can compare the estimated costs by running your queries at the same time in Query Analyzer with the show execution plan option on. However, I would still choose the query with the lowest actual execution time. All things being equal, select the simplest approach. BTW, make sure you have a unique index on tblConCat (CctConID and CctCatCode). A primary key, unique constraint or unique index will provide this. Untested Solution Three (there are many more): SELECT tblContact.* FROM tblContact c JOIN tblConCat cc1 ON c.ConID = cc1.CctConID AND CctCatCode='010101' JOIN tblConCat cc2 ON c.ConID = cc2.CctConID AND CctCatCode='010102' JOIN tblConCat cc3 ON c.ConID = cc3.CctConID AND CctCatCode='010103' -- Hope this helps. Dan Guzman SQL Server MVP "Justin Hoffman" <j@b.com> wrote in message news:d4am7c$sus$1@nw rdmz03.dmz.ncs.ea.ibs-infra.bt.com... > > This is a question concerning query optimisation. Sorry if it's a bit > long, but thanks to anyone who has the patience to help - This is my first > post here... > > > If I have two tables: 'tblContact' and 'tblCategory' where categories are > like: > Code Name > 010101 Short > 010102 Fat > 010103 Stupid > > The junction table 'tblConCat' has fields CctConID, CctCatCode to tell me > which contacts have which category codes. These are nchar(6) fields, if > it makes any difference. > > If I need to find all people who are short, fat and stupid I can see two > ways: > > Solution One: > SELECT tblContact.* FROM tblContact WHERE > ConID IN (SELECT CctConID FROM tblConCat WHERE CctCatCode='010101') AND > ConID IN (SELECT CctConID FROM tblConCat WHERE CctCatCode='010102') AND > ConID IN (SELECT CctConID FROM tblConCat WHERE CctCatCode='010103') > > > Solution Two: > Build a helper table which contains the codes I'm looking for > > SELECT tblContact.* FROM tblContact > WHERE ConID IN > (SELECT CctConID > FROM tblConCat INNER JOIN tblHelper > ON tblConCat.CctCatCode = tblHelper.HlpCatCode > GROUP BY CctConID HAVING Count(*)=3) > > > I have tried them both out and although I looked at the query analyzer it > provided more information than I knew what to do with. In practise they > both provide similar working times (I wait about a second) but I thought > the first looked rather inefficient and thought my helper table might > help. There are about 30,000 contact records, 180 category records and > 120,000 junction table records. > > All I am looking for comments on any pros and cons of these two > approaches - does one look that bad? The database was migrated from > Access where the helper table really did help, but using SQL Server I > might not need it. > > Thanks again, if you got this far! > >
Post Follow-up to this message"Dan Guzman" <guzmanda@nospam-online.sbcglobal.net> wrote in message news:vb6ae.1041$yd7.788@newssvr11.news.prodigy.com... > > You can compare the estimated costs by running your queries at the same > time in Query Analyzer with the show execution plan option on. However, I > would still choose the query with the lowest actual execution time. All > things being equal, select the simplest approach. > > BTW, make sure you have a unique index on tblConCat (CctConID and > CctCatCode). A primary key, unique constraint or unique index will > provide this. > > Untested Solution Three (there are many more): > > SELECT tblContact.* > FROM tblContact c > JOIN tblConCat cc1 ON > c.ConID = cc1.CctConID AND > CctCatCode='010101' > JOIN tblConCat cc2 ON > c.ConID = cc2.CctConID AND > CctCatCode='010102' > JOIN tblConCat cc3 ON > c.ConID = cc3.CctConID AND > CctCatCode='010103' > > -- > Hope this helps. > > Dan Guzman > SQL Server MVP > > "Justin Hoffman" <j@b.com> wrote in message > news:d4am7c$sus$1@nw rdmz03.dmz.ncs.ea.ibs-infra.bt.com... Hi Dan Thank you for your comments. The SQL needed a couple of adjustments: the CctCatCode fields needed a table prefix to make them unambiguous and also (probable typo) I need (select c.*) rather than (select tblContact.*) So the final version is: SELECT c.* FROM tblContact c JOIN tblConCat cc1 ON c.ConID = cc1.CctConID AND cc1.CctCatCode='010101' JOIN tblConCat cc2 ON c.ConID = cc2.CctConID AND cc2.CctCatCode='010102' JOIN tblConCat cc3 ON c.ConID = cc3.CctConID AND cc3.CctCatCode='010103' Anyway, the result again is pretty fast and I guess I am wasting my time building a helper table if it doesn't noticeably increase the speed but does increase the complexity (the helper tables need to be dynamically created in a multi-user environment). I looked at the execution plans, but it gives so much information (cost, subtree cost, etc) that it leaves a simple person like me somewhat lost. What I would like to know is 'which is better: A or B?' Even when I run them, I am not sure how to compare the actual performance. Is there one key number to look at? Perhaps 'Cumulative client processing time'? Perhaps I should try to read up on this. (PS I do have that index you mentioned)
Post Follow-up to this message> What I would like to know is 'which is better: A or B?' Even when I run > them, I am not sure how to compare the actual performance. Is there one > key number to look at? I usually compare actual query duration in a controlled test environment with no other activity and clean cache. CHECKPOINT DBCC DROPCLEANBUFFERS DBCC FREEPROCCACHE GO SELECT GETDATE() AS StartTime GO --execute first query GO SELECT GETDATE() AS EndTime GO CHECKPOINT DBCC DROPCLEANBUFFERS DBCC FREEPROCCACHE GO SELECT GETDATE() AS StartTime GO --execute second query GO SELECT GETDATE() AS EndTime GO -- Hope this helps. Dan Guzman SQL Server MVP "Justin Hoffman" <j@b.com> wrote in message news:d4auds$fe6$1@nw rdmz01.dmz.ncs.ea.ibs-infra.bt.com... > > "Dan Guzman" <guzmanda@nospam-online.sbcglobal.net> wrote in message > news:vb6ae.1041$yd7.788@newssvr11.news.prodigy.com... > > > > Hi Dan > Thank you for your comments. The SQL needed a couple of adjustments: the > CctCatCode fields needed a table prefix to make them unambiguous and also > (probable typo) I need (select c.*) rather than (select tblContact.*) So > the final version is: > > SELECT c.* > FROM tblContact c > JOIN tblConCat cc1 ON > c.ConID = cc1.CctConID AND > cc1.CctCatCode='010101' > JOIN tblConCat cc2 ON > c.ConID = cc2.CctConID AND > cc2.CctCatCode='010102' > JOIN tblConCat cc3 ON > c.ConID = cc3.CctConID AND > cc3.CctCatCode='010103' > > Anyway, the result again is pretty fast and I guess I am wasting my time > building a helper table if it doesn't noticeably increase the speed but > does increase the complexity (the helper tables need to be dynamically > created in a multi-user environment). > I looked at the execution plans, but it gives so much information (cost, > subtree cost, etc) that it leaves a simple person like me somewhat lost. > What I would like to know is 'which is better: A or B?' Even when I run > them, I am not sure how to compare the actual performance. Is there one > key number to look at? Perhaps 'Cumulative client processing time'? > Perhaps I should try to read up on this. > (PS I do have that index you mentioned) >
Post Follow-up to this message"Dan Guzman" <guzmanda@nospam-online.sbcglobal.net> wrote in message news:Cc7ae.1061$yd7.363@newssvr11.news.prodigy.com... > > I usually compare actual query duration in a controlled test environment > with no other activity and clean cache. > > CHECKPOINT > DBCC DROPCLEANBUFFERS > DBCC FREEPROCCACHE > GO > SELECT GETDATE() AS StartTime > GO > --execute first query > GO > SELECT GETDATE() AS EndTime > GO > > CHECKPOINT > DBCC DROPCLEANBUFFERS > DBCC FREEPROCCACHE > GO > SELECT GETDATE() AS StartTime > GO > --execute second query > GO > SELECT GETDATE() AS EndTime > GO > > -- > Hope this helps. > > Dan Guzman > SQL Server MVP Thanks, Dan. I'll put that in my 'useful things to know' file.
Post Follow-up to this messageThis is a "relational division"; it is one of Codd's original operations. You might also want to look up the proper way to name data elements, so you will not have those silly "tbl-" prefixes and multiple suffixes. And rows are not records. Relational division is one of the eight basic operations in Codd's relational algebra. The idea is that a divisor table is used to partition a dividend table and produce a quotient or results table. The quotient table is made up of those values of one column for which a second column had all of the values in the divisor. This is easier to explain with an example. We have a table of pilots and the planes they can fly (dividend); we have a table of planes in the hangar (divisor); we want the names of the pilots who can fly every plane (quotient) in the hangar. To get this result, we divide the PilotSkills table by the planes in the hangar. CREATE TABLE PilotSkills (pilot CHAR(15) NOT NULL, plane CHAR(15) NOT NULL, PRIMARY KEY (pilot, plane)); PilotSkills pilot plane ==================== ===== 'Celko' 'Piper Cub' 'Higgins' 'B-52 Bomber' 'Higgins' 'F-14 Fighter' 'Higgins' 'Piper Cub' 'Jones' 'B-52 Bomber' 'Jones' 'F-14 Fighter' 'Smith' 'B-1 Bomber' 'Smith' 'B-52 Bomber' 'Smith' 'F-14 Fighter' 'Wilson' 'B-1 Bomber' 'Wilson' 'B-52 Bomber' 'Wilson' 'F-14 Fighter' 'Wilson' 'F-17 Fighter' CREATE TABLE Hangar (plane CHAR(15) NOT NULL PRIMARY KEY); Hangar plane ============= 'B-1 Bomber' 'B-52 Bomber' 'F-14 Fighter' PilotSkills DIVIDED BY Hangar pilot ==================== ========= 'Smith' 'Wilson' In this example, Smith and Wilson are the two pilots who can fly everything in the hangar. Notice that Higgins and Celko know how to fly a Piper Cub, but we don't have one right now. In Codd's original definition of relational division, having more rows than are called for is not a problem. The important characteristic of a relational division is that the CROSS JOIN (Cartesian product) of the divisor and the quotient produces a valid subset of rows from the dividend. This is where the name comes from, since the CROSS JOIN acts like a multiplication operator. Division with a Remainder There are two kinds of relational division. Division with a remainder allows the dividend table to have more values than the divisor, which was Codd's original definition. For example, if a pilot can fly more planes than just those we have in the hangar, this is fine with us. The query can be written in SQL-89 as SELECT DISTINCT pilot FROM PilotSkills AS PS1 WHERE NOT EXISTS (SELECT * FROM Hangar WHERE NOT EXISTS (SELECT * FROM PilotSkills AS PS2 WHERE (PS1.pilot = PS2.pilot) AND (PS2.plane = Hangar.plane))); The quickest way to explain what is happening in this query is to imagine an old World War II movie where a cocky pilot has just walked into the hangar, looked over the fleet, and announced, "There ain't no plane in this hangar that I can't fly!" We are finding the pilots for whom there does not exist a plane in the hangar for which they have no skills. The use of the NOT EXISTS() predicates is for speed. Most SQL systems will look up a value in an index rather than scan the whole table. The SELECT * clause lets the query optimizer choose the column to use when looking for the index. This query for relational division was made popular by Chris Date in his textbooks, but it is not the only method nor always the fastest. Another version of the division can be written so as to avoid three levels of nesting. While it is not original with me, I have made it popular in my books. SELECT PS1.pilot FROM PilotSkills AS PS1, Hangar AS H1 WHERE PS1.plane = H1.plane GROUP BY PS1.pilot HAVING COUNT(PS1.plane) = (SELECT COUNT(plane) FROM Hangar); There is a serious difference in the two methods. Burn down the hangar, so that the divisor is empty. Because of the NOT EXISTS() predicates in Date's query, all pilots are returned from a division by an empty set. Because of the COUNT() functions in my query, no pilots are returned from a division by an empty set. In the sixth edition of his book, INTRODUCTION TO DATABASE SYSTEMS (Addison-Wesley; 1995 ;ISBN 0-201-82458-2), Chris Date defined another operator (DIVIDEBY ... PER) which produces the same results as my query, but with more complexity. Exact Division The second kind of relational division is exact relational division. The dividend table must match exactly to the values of the divisor without any extra values. SELECT PS1.pilot FROM PilotSkills AS PS1 LEFT OUTER JOIN Hangar AS H1 ON PS1.plane = H1.plane GROUP BY PS1.pilot HAVING COUNT(PS1.plane) = (SELECT COUNT(plane) FROM Hangar) AND COUNT(H1.plane) = (SELECT COUNT(plane) FROM Hangar); This says that a pilot must have the same number of certificates as there planes in the hangar and these certificates all match to a plane in the hangar, not something else. The "something else" is shown by a created NULL from the LEFT OUTER JOIN. Please do not make the mistake of trying to reduce the HAVING clause with a little algebra to: HAVING COUNT(PS1.plane) = COUNT(H1.plane) because it does not work; it will tell you that the hangar has (n) planes in it and the pilot is certified for (n) planes, but not that those two sets of planes are equal to each other. Note on Performance The nested EXISTS() predicates version of relational division was made popular by Chris Date's textbooks, while the author is associated with popularizing the COUNT(*) version of relational division. The Winter 1996 edition of DB2 ON-LINE MAGAZINE (http://www.db2mag.com/96011ar:htm) had an article entitled "Powerful SQL:Beyond the Basics" by Sheryl Larsen which gave the results of testing both methods. Her conclusion for DB2 was that the nested EXISTS() version is better when the quotient has less than 25% of the dividend table's rows and the COUNT(*) version is better when the quotient is more than 25% of the dividend table.
Post Follow-up to this message"--CELKO--" <jcelko212@earthlink.net> wrote in message news:1114182329.179802.208710@g14g2000cwa.googlegroups.com... > This is a "relational division"; it is one of Codd's original > operations. You might also want to look up the proper way to name data > elements, so you will not have those silly "tbl-" prefixes and multiple > suffixes. And rows are not records. > > Relational division is one of the eight basic operations in Codd's > relational algebra. The idea is that a divisor table is used to > partition a dividend table and produce a quotient or results table. > The quotient table is made up of those values of one column for which a > second column had all of the values in the divisor. > > This is easier to explain with an example. We have a table of pilots > and the planes they can fly (dividend); we have a table of planes in > the hangar (divisor); we want the names of the pilots who can fly every > plane (quotient) in the hangar. To get this result, we divide the > PilotSkills table by the planes in the hangar. > > CREATE TABLE PilotSkills > (pilot CHAR(15) NOT NULL, > plane CHAR(15) NOT NULL, > PRIMARY KEY (pilot, plane)); > > PilotSkills > pilot plane > ==================== ===== > 'Celko' 'Piper Cub' > 'Higgins' 'B-52 Bomber' > 'Higgins' 'F-14 Fighter' > 'Higgins' 'Piper Cub' > 'Jones' 'B-52 Bomber' > 'Jones' 'F-14 Fighter' > 'Smith' 'B-1 Bomber' > 'Smith' 'B-52 Bomber' > 'Smith' 'F-14 Fighter' > 'Wilson' 'B-1 Bomber' > 'Wilson' 'B-52 Bomber' > 'Wilson' 'F-14 Fighter' > 'Wilson' 'F-17 Fighter' > > CREATE TABLE Hangar > (plane CHAR(15) NOT NULL PRIMARY KEY); > > Hangar > plane > ============= > 'B-1 Bomber' > 'B-52 Bomber' > 'F-14 Fighter' > > PilotSkills DIVIDED BY Hangar > pilot > ==================== ========= > 'Smith' > 'Wilson' > > In this example, Smith and Wilson are the two pilots who can fly > everything in the hangar. Notice that Higgins and Celko know how to > fly a Piper Cub, but we don't have one right now. In Codd's original > definition of relational division, having more rows than are called for > is not a problem. > > The important characteristic of a relational division is that the CROSS > JOIN (Cartesian product) of the divisor and the quotient produces a > valid subset of rows from the dividend. This is where the name comes > from, since the CROSS JOIN acts like a multiplication operator. > > Division with a Remainder > > There are two kinds of relational division. Division with a remainder > allows the dividend table to have more values than the divisor, which > was Codd's original definition. For example, if a pilot can fly more > planes than just those we have in the hangar, this is fine with us. > The query can be written in SQL-89 as > > SELECT DISTINCT pilot > FROM PilotSkills AS PS1 > WHERE NOT EXISTS > (SELECT * > FROM Hangar > WHERE NOT EXISTS > (SELECT * > FROM PilotSkills AS PS2 > WHERE (PS1.pilot = PS2.pilot) > AND (PS2.plane = Hangar.plane))); > > The quickest way to explain what is happening in this query is to > imagine an old World War II movie where a cocky pilot has just walked > into the hangar, looked over the fleet, and announced, "There ain't no > plane in this hangar that I can't fly!" We are finding the pilots for > whom there does not exist a plane in the hangar for which they have no > skills. The use of the NOT EXISTS() predicates is for speed. Most SQL > systems will look up a value in an index rather than scan the whole > table. The SELECT * clause lets the query optimizer choose the column > to use when looking for the index. > > This query for relational division was made popular by Chris Date in > his textbooks, but it is not the only method nor always the fastest. > Another version of the division can be written so as to avoid three > levels of nesting. While it is not original with me, I have made it > popular in my books. > > SELECT PS1.pilot > FROM PilotSkills AS PS1, Hangar AS H1 > WHERE PS1.plane = H1.plane > GROUP BY PS1.pilot > HAVING COUNT(PS1.plane) = (SELECT COUNT(plane) FROM Hangar); > > There is a serious difference in the two methods. Burn down the > hangar, so that the divisor is empty. Because of the NOT EXISTS() > predicates in Date's query, all pilots are returned from a division by > an empty set. Because of the COUNT() functions in my query, no pilots > are returned from a division by an empty set. > > In the sixth edition of his book, INTRODUCTION TO DATABASE SYSTEMS > (Addison-Wesley; 1995 ;ISBN 0-201-82458-2), Chris Date defined another > operator (DIVIDEBY ... PER) which produces the same results as my > query, but with more complexity. > > Exact Division > > The second kind of relational division is exact relational division. > The dividend table must match exactly to the values of the divisor > without any extra values. > > SELECT PS1.pilot > FROM PilotSkills AS PS1 > LEFT OUTER JOIN > Hangar AS H1 > ON PS1.plane = H1.plane > GROUP BY PS1.pilot > HAVING COUNT(PS1.plane) = (SELECT COUNT(plane) FROM Hangar) > AND COUNT(H1.plane) = (SELECT COUNT(plane) FROM Hangar); > > This says that a pilot must have the same number of certificates as > there planes in the hangar and these certificates all match to a plane > in the hangar, not something else. The "something else" is shown by a > created NULL from the LEFT OUTER JOIN. > > Please do not make the mistake of trying to reduce the HAVING clause > with a little algebra to: > > HAVING COUNT(PS1.plane) = COUNT(H1.plane) > > because it does not work; it will tell you that the hangar has (n) > planes in it and the pilot is certified for (n) planes, but not that > those two sets of planes are equal to each other. > > Note on Performance > > The nested EXISTS() predicates version of relational division was made > popular by Chris Date's textbooks, while the author is associated with > popularizing the COUNT(*) version of relational division. The Winter > 1996 edition of DB2 ON-LINE MAGAZINE > (http://www.db2mag.com/96011ar:htm) had an article entitled "Powerful > SQL:Beyond the Basics" by Sheryl Larsen which gave the results of > testing both methods. Her conclusion for DB2 was that the nested > EXISTS() version is better when the quotient has less than 25% of the > dividend table's rows and the COUNT(*) version is better when the > quotient is more than 25% of the dividend table. Thank you CELKO. Was this an answer specifically for me, or a cut and paste from elsewhere? Apart from the specific advice that my naming convention is silly, do you have any suggested SQL for the problem in hand? I can't quite make the jump from 'which pilots can fly all the planes in the hanger' to 'which contacts are short, fat and stupid'. I do already have two solutions and Dan Guzman suggested a third and all of these work. The question is really about optimizing the SQL now that the database has moved from Access to SQL Server.
Post Follow-up to this messageOn Fri, 22 Apr 2005 15:54:31 +0000 (UTC), Justin Hoffman wrote:
> Thank you CELKO. Was this an answer specifically for me, or a cut and pas
te
> from elsewhere?
It is his standard Relational Division piece.
> Apart from the specific advice that my naming convention is
> silly, do you have any suggested SQL for the problem in hand? I can't qui
te
> make the jump from 'which pilots can fly all the planes in the hanger' to
> 'which contacts are short, fat and stupid'. I do already have two solutio
ns
> and Dan Guzman suggested a third and all of these work. The question is
> really about optimizing the SQL now that the database has moved from Acces
s
> to SQL Server.
In fact, Celko's preferred Relational Division style amounts to the same
thing as your second solution, except that no "helper table" is really
necessary. Here is my take on it:
SELECT C.* FROM tblContact C
INNER JOIN tblConCat CC
ON C.ConID = CC.CctConID
WHERE CC.CctCatCode IN ('010101', '010102', '010103')
GROUP BY C.ConID
HAVING COUNT(*)=3
Now say you don't want to limit it to just those three categories. Suppose
your tblCategory looks like this:
CREATE TABLE tblCategory (
CatCode char(6),
CatName varchar(20),
InsultLevel int
)
INSERT INTO tblCategory VALUES ('010101', 'Short', 1)
INSERT INTO tblCategory VALUES ('010102', 'Fat', 2)
INSERT INTO tblCategory VALUES ('010103', 'Stupid', 2)
INSERT INTO tblCategory VALUES ('010104', 'Spanish', 0)
INSERT INTO tblCategory VALUES ('010105', 'Unemployable', 1)
And you want to find all the contacts who match all categories that have an
InsultLevel of at least 1. You don't have to know ahead of time what
categories those are, or even how many categories those are:
SELECT C.* FROM tblContact C
INNER JOIN tblConCat CC
ON C.ConID = CC.CctConID
INNER JOIN tblCategory CG
ON CC.CctCatCode = CG.CatCode
WHERE CG.InsultLevel >= 1
GROUP BY C.ConID
HAVING COUNT(*)=(
SELECT COUNT(*) FROM Category
WHERE Category.InsultLevel >= 1
)
Make sense?
Post Follow-up to this messageOn 22 Apr 2005 08:05:29 -0700, --CELKO-- wrote: (snip) >Exact Division (snip) >SELECT PS1.pilot > FROM PilotSkills AS PS1 > LEFT OUTER JOIN > Hangar AS H1 > ON PS1.plane = H1.plane > GROUP BY PS1.pilot > HAVING COUNT(PS1.plane) = (SELECT COUNT(plane) FROM Hangar) > AND COUNT(H1.plane) = (SELECT COUNT(plane) FROM Hangar); (snip) >Please do not make the mistake of trying to reduce the HAVING clause >with a little algebra to: > > HAVING COUNT(PS1.plane) = COUNT(H1.plane) > >because it does not work; Hi Joe, But (correct me if I'm wrong) there is nothing wrong with reducing the HAVING clause to HAVING COUNT(PS1.plane) = (SELECT COUNT(plane) FROM Hangar) AND COUNT(H1.plane) = COUNT(PS1.plane); Right? A clever optimizer would of course not perform the same subquery twice, but I'm not too convinced that all optimizers are that clever... (In fact, I just tested both on your sample data, and the execution plan shows that SQL Server 2000 will indeed perform the same COUNT subquery twice - at least with this small amount of rows. Maybe it will cough up a better plan when there are a few million rows involved?) Best, Hugo -- (Remove _NO_ and _SPAM_ to get my e-mail address)
Post Follow-up to this message>> just tested both on your sample data, and the execution plan shows that SQL Server 2000 will indeed perform the same COUNT subquery twice - at least with this small amount of rows. << That is awful! That is a matter of a parser finding a deterministic common subexpression, not something hard. rows involved? << I don't know, but it looks like the optimizer is just plain stupid in the HAVING clause.
Post Follow-up to this message
Show a Printable Version
Email This Page to Someone!
Receive updates to this thread