|
Home > Archive > Microsoft SQL Server forum > February 2006 > Subquery -> slowness
You are viewing an archived Text-only version of the thread.
To view this thread in it's original format and/or if you want to reply to
this thread please [click here]
| Author |
Subquery -> slowness
|
|
| jim_geissman@countrywide.com 2006-02-25, 9:45 am |
| I have some queries that involve subqueries to select the appropriate
record from a bunch of candidates. For example the following, which
selects the most recent transaction for a given customer:
CREATE TABLE #Customer (CustID int, OtherInfo int)
INSERT #Customer SELECT 1,7
INSERT #Customer SELECT 2,8
INSERT #Customer SELECT 3,9
CREATE TABLE #Event (CustID int, EventID int, EventAmt int, EventDt
smalldatetime)
INSERT #Event SELECT 1,1,1000,'20060224'
INSERT #Event SELECT 2,1,2000,'20060224'
INSERT #Event SELECT 3,2,3000,'20060224'
INSERT #Event SELECT 3,1,5000,'20060225'
SELECT c.CustID,c.OtherInfo ,e.EventAmt,e.EventDt
FROM #Customer c JOIN #Event e ON c.CustID=e.CustID
WHERE EventDt = (SELECT MAX(EventDt) FROM #Event WHERE CustID=c.CustID)
ORDER BY c.CustID
Over millions of customers and events, this takes forever. Creating a
temp table which identifies the appropriate dates, and then joining to
that, speeds things up considerably -- the following two queries
together take a lot less time the the one above.
CREATE TABLE #Temp (CustID int,EventDt smalldatetime)
INSERT #Temp SELECT CustID,MAX(EventDt)F
ROM #Event GROUP BY CustID
SELECT c.CustID,c.OtherInfo ,e.EventAmt,e.EventDt
FROM #Customer c
JOIN #Temp t ON t.CustID=c.CustID
JOIN #Event e ON c.CustID=e.CustID AND t.EventDt=e.EventDt
This seems pretty simple, and I would assume the query planner should
be able to figure it out without assistance. Is there a better way to
forumulate the original query? BTW, as far as I can tell the indexes
are appropriate for these queries; however the subqueries seem to be
done one at a time.
| |
| Alexander Kuznetsov 2006-02-25, 9:45 am |
| Jim,
you are correct: although sometimes the optimizer will refactor queries
with subqueries as joins, do not rely on it doing that in all the
cases.
Try this:
SELECT c.CustID,c.OtherInfo ,e.EventAmt,e.EventDt
FROM #Customer c JOIN #Event e ON c.CustID=e.CustID
JOIN
(SELECT MAX(EventDt) maxEventDt, CustID FROM #Event e group by CustID)
m ON c.CustID=e.CustID and .EventDt = m.maxEventDt
ORDER BY c.CustID
Good luck!
| |
| jim_geissman@countrywide.com 2006-02-25, 9:45 am |
| Thanks, Alexander. I'll try that.
Regards,
Jim
|
|
|
|
|