Home > Archive > Microsoft SQL Server forum > February 2006 > Subquery -> slowness









You are viewing an archived Text-only version of the thread. To view this thread in it's original format and/or if you want to reply to this thread please [click here]

 

Author Subquery -> slowness
jim_geissman@countrywide.com

2006-02-25, 9:45 am

I have some queries that involve subqueries to select the appropriate
record from a bunch of candidates. For example the following, which
selects the most recent transaction for a given customer:

CREATE TABLE #Customer (CustID int, OtherInfo int)
INSERT #Customer SELECT 1,7
INSERT #Customer SELECT 2,8
INSERT #Customer SELECT 3,9

CREATE TABLE #Event (CustID int, EventID int, EventAmt int, EventDt
smalldatetime)
INSERT #Event SELECT 1,1,1000,'20060224'
INSERT #Event SELECT 2,1,2000,'20060224'
INSERT #Event SELECT 3,2,3000,'20060224'
INSERT #Event SELECT 3,1,5000,'20060225'

SELECT c.CustID,c.OtherInfo ,e.EventAmt,e.EventDt
FROM #Customer c JOIN #Event e ON c.CustID=e.CustID
WHERE EventDt = (SELECT MAX(EventDt) FROM #Event WHERE CustID=c.CustID)
ORDER BY c.CustID

Over millions of customers and events, this takes forever. Creating a
temp table which identifies the appropriate dates, and then joining to
that, speeds things up considerably -- the following two queries
together take a lot less time the the one above.

CREATE TABLE #Temp (CustID int,EventDt smalldatetime)
INSERT #Temp SELECT CustID,MAX(EventDt)F
ROM #Event GROUP BY CustID

SELECT c.CustID,c.OtherInfo ,e.EventAmt,e.EventDt
FROM #Customer c
JOIN #Temp t ON t.CustID=c.CustID
JOIN #Event e ON c.CustID=e.CustID AND t.EventDt=e.EventDt

This seems pretty simple, and I would assume the query planner should
be able to figure it out without assistance. Is there a better way to
forumulate the original query? BTW, as far as I can tell the indexes
are appropriate for these queries; however the subqueries seem to be
done one at a time.

Alexander Kuznetsov

2006-02-25, 9:45 am

Jim,

you are correct: although sometimes the optimizer will refactor queries
with subqueries as joins, do not rely on it doing that in all the
cases.

Try this:

SELECT c.CustID,c.OtherInfo ,e.EventAmt,e.EventDt
FROM #Customer c JOIN #Event e ON c.CustID=e.CustID
JOIN
(SELECT MAX(EventDt) maxEventDt, CustID FROM #Event e group by CustID)
m ON c.CustID=e.CustID and .EventDt = m.maxEventDt
ORDER BY c.CustID

Good luck!

jim_geissman@countrywide.com

2006-02-25, 9:45 am

Thanks, Alexander. I'll try that.

Regards,
Jim

Sponsored Links





Also available: Server administration forum archive | Web Design forum archive | Software forum archive | Hardware reviews archive | Programming forum archive

Copyright 2008 droptable.com