| Author |
Processes blocked on tempdb resources and can't rollback
|
|
| pshroads@gmail.com 2006-02-28, 8:23 pm |
| I have a strange situation. A SPID is blocking 3 other SPIDs. The
blocking SPID's lastwaittype is PAGELATCH_UP and the waittype is 0x0413
(Update buffer page latch). The waitresource is 2:1:444 indicating
something in tempdb.
The blocked SPIDs all have the exact same lastwaittype, waittype and
waitresource as the blocking SPID. After the blocking SPID had been
running for over three hours we decided to kill it. It has now been
rolling back for over 15 hours. Sysprocesses shows no i/o or CPU
activity for either the blocking SPID or the blocked SPIDs. The only
thing moving is the waittime. I ran a profiler trace on the blocking
and blocked SPID and there appears to be no activity whatsoever.
Using sp_lock I can see that the blocking SPID (336) has an exclusive
lock on page 444 in tempdb:
spid dbid ObjId IndId Type Resource Mode Status
------ ------ ----------- ------ ---- ---------------- -------- ------
336 5 56035631 0 TAB Sch-S GRANT
336 2 1038803711 0 TAB X GRANT
336 2 0 0 EXT 1:1114744 X GRANT
336 5 1440724185 0 TAB Sch-S GRANT
336 5 597577167 0 TAB Sch-S GRANT
336 5 437576597 0 TAB Sch-S GRANT
336 5 569417448 0 TAB Sch-S GRANT
336 2 0 0 PAG 1:444 X GRANT
336 2 99 0 RID 1:652945:4 X GRANT
336 5 1229247434 0 TAB Sch-S GRANT
336 5 1870121903 0 TAB Sch-S GRANT
336 5 72035688 0 TAB Sch-S GRANT
336 5 0 0 DB S GRANT
The locks for the blocking spids are:
spid dbid ObjId IndId Type Resource Mode Status
------ ------ ----------- ------ ---- ---------------- -------- ------
199 5 0 0 DB S GRANT
199 2 2 0 TAB IX GRANT
199 2 3 0 TAB IX GRANT
199 2 3 1 KEY (4f00a7ed7a34) X GRANT
199 2 3 1 KEY (4d002c25739e) X GRANT
199 2 3 2 KEY (b500f9ccd622) X GRANT
199 2 3 1 KEY (4e00c28ac68c) X GRANT
199 2 1313460556 0 TAB Sch-M GRANT
199 2 3 2 KEY (f801aa1d1a1b) X GRANT
199 2 3 2 KEY (e00123c3615b) X GRANT
351 5 0 0 DB S GRANT
351 2 3 2 KEY (2c03f5e906a1) X GRANT
351 2 3 2 KEY (ef030f1cc5a3) X GRANT
351 2 959571505 0 TAB Sch-M GRANT
351 2 3 2 KEY (0b037f9b0b6b) X GRANT
351 2 3 2 KEY (3b028953ed31) X GRANT
351 2 3 2 KEY (3303ac26b345) X GRANT
351 2 3 2 KEY (1303b2e0451f) X GRANT
351 2 3 2 KEY (47045d5d8447) X GRANT
351 2 3 2 KEY (b604fc5513dc) X GRANT
351 2 3 1 KEY (3c00a669dc07) X GRANT
351 2 3 1 KEY (3f0094990230) X GRANT
351 2 3 1 KEY (3e007a36b722) X GRANT
351 2 3 1 KEY (38001ed60a4d) X GRANT
351 2 3 1 KEY (4400d6167192) X GRANT
351 2 3 1 KEY (33002c26d47a) X GRANT
351 2 3 1 KEY (3200c2896168) X GRANT
351 2 3 1 KEY (45006f2ea60f) X GRANT
351 2 3 2 KEY (d401767461f7) X GRANT
351 2 3 1 KEY (3d001f510b9a) X GRANT
351 2 3 0 TAB IX GRANT
351 2 2 0 TAB IX GRANT
351 2 3 1 KEY (4300b371cd2a) X GRANT
351 2 3 1 KEY (3600951e03e7) X GRANT
351 2 3 1 KEY (42005dde7838) X GRANT
351 2 3 1 KEY (37007bb1b6f5) X GRANT
351 2 3 1 KEY (3400494168c2) X GRANT
351 2 3 1 KEY (3a002da1d5ad) X GRANT
351 2 3 1 KEY (3b00c30e60bf) X GRANT
351 2 3 2 KEY (fa03774b7ea1) X GRANT
351 2 3 2 KEY (0903f111a581) X GRANT
351 2 3 1 KEY (4000f1febe88) X GRANT
351 2 3 2 KEY (1403d98217fd) X GRANT
351 2 3 2 KEY (de016b7f8f59) X GRANT
351 2 3 2 KEY (700148cf11b9) X GRANT
351 2 3 2 KEY (d203a4f6c68f) X GRANT
351 2 3 2 KEY (7a0355fa61d9) X GRANT
351 2 3 1 KEY (390048c66915) X GRANT
351 2 3 1 KEY (3500f079bf5f) X GRANT
351 2 3 1 KEY (410038b9c480) X GRANT
351 2 3 2 KEY (9a004a0b4b3d) X GRANT
351 2 3 2 KEY (a8039f428374) X GRANT
351 2 3 2 KEY (56040a0703ea) X GRANT
351 2 3 2 KEY (4a0414fbe815) X GRANT
361 5 0 0 DB S GRANT
361 2 3 0 TAB IX GRANT
361 2 3 2 KEY (6a05f3bf6a68) X GRANT
361 2 3 2 KEY (6505ef2c2b6c) X GRANT
361 2 3 2 KEY (49017efdb29b) X GRANT
361 2 2 0 TAB IX GRANT
361 2 3 1 KEY (e20001f6db04) X GRANT
361 2 1150700000 0 TAB Sch-M GRANT
361 2 3 1 KEY (e300649167bc) X GRANT
361 2 3 1 KEY (e100ef596e16) X GRANT
Using DBCC INPUTBUFFER I see that the blocking SPID is running a stored
procedure. Using fn_get_sql I see that the current command is a user
defined function within that procedure (but one that does not hit any
tables, it just parses a string). The UDF is called while updating a
temporary table.
I am stumped. Can anyone offer any suggestions?
Thanks!
| |
| JosephPruiett 2006-02-28, 8:23 pm |
| Could you please post more information about this issue, like how many
records are being pushed into tempdb table that the function is being ran on,
the details in the function, Etc.
That way we can better help you determin the issue.
"pshroads@gmail.com" wrote:
> I have a strange situation. A SPID is blocking 3 other SPIDs. The
> blocking SPID's lastwaittype is PAGELATCH_UP and the waittype is 0x0413
> (Update buffer page latch). The waitresource is 2:1:444 indicating
> something in tempdb.
>
> The blocked SPIDs all have the exact same lastwaittype, waittype and
> waitresource as the blocking SPID. After the blocking SPID had been
> running for over three hours we decided to kill it. It has now been
> rolling back for over 15 hours. Sysprocesses shows no i/o or CPU
> activity for either the blocking SPID or the blocked SPIDs. The only
> thing moving is the waittime. I ran a profiler trace on the blocking
> and blocked SPID and there appears to be no activity whatsoever.
>
> Using sp_lock I can see that the blocking SPID (336) has an exclusive
> lock on page 444 in tempdb:
>
> spid dbid ObjId IndId Type Resource Mode Status
> ------ ------ ----------- ------ ---- ---------------- -------- ------
> 336 5 56035631 0 TAB Sch-S GRANT
> 336 2 1038803711 0 TAB X GRANT
> 336 2 0 0 EXT 1:1114744 X GRANT
> 336 5 1440724185 0 TAB Sch-S GRANT
> 336 5 597577167 0 TAB Sch-S GRANT
> 336 5 437576597 0 TAB Sch-S GRANT
> 336 5 569417448 0 TAB Sch-S GRANT
> 336 2 0 0 PAG 1:444 X GRANT
> 336 2 99 0 RID 1:652945:4 X GRANT
> 336 5 1229247434 0 TAB Sch-S GRANT
> 336 5 1870121903 0 TAB Sch-S GRANT
> 336 5 72035688 0 TAB Sch-S GRANT
> 336 5 0 0 DB S GRANT
>
> The locks for the blocking spids are:
>
> spid dbid ObjId IndId Type Resource Mode Status
> ------ ------ ----------- ------ ---- ---------------- -------- ------
> 199 5 0 0 DB S GRANT
> 199 2 2 0 TAB IX GRANT
> 199 2 3 0 TAB IX GRANT
> 199 2 3 1 KEY (4f00a7ed7a34) X GRANT
> 199 2 3 1 KEY (4d002c25739e) X GRANT
> 199 2 3 2 KEY (b500f9ccd622) X GRANT
> 199 2 3 1 KEY (4e00c28ac68c) X GRANT
> 199 2 1313460556 0 TAB Sch-M GRANT
> 199 2 3 2 KEY (f801aa1d1a1b) X GRANT
> 199 2 3 2 KEY (e00123c3615b) X GRANT
>
> 351 5 0 0 DB S GRANT
> 351 2 3 2 KEY (2c03f5e906a1) X GRANT
> 351 2 3 2 KEY (ef030f1cc5a3) X GRANT
> 351 2 959571505 0 TAB Sch-M GRANT
> 351 2 3 2 KEY (0b037f9b0b6b) X GRANT
> 351 2 3 2 KEY (3b028953ed31) X GRANT
> 351 2 3 2 KEY (3303ac26b345) X GRANT
> 351 2 3 2 KEY (1303b2e0451f) X GRANT
> 351 2 3 2 KEY (47045d5d8447) X GRANT
> 351 2 3 2 KEY (b604fc5513dc) X GRANT
> 351 2 3 1 KEY (3c00a669dc07) X GRANT
> 351 2 3 1 KEY (3f0094990230) X GRANT
> 351 2 3 1 KEY (3e007a36b722) X GRANT
> 351 2 3 1 KEY (38001ed60a4d) X GRANT
> 351 2 3 1 KEY (4400d6167192) X GRANT
> 351 2 3 1 KEY (33002c26d47a) X GRANT
> 351 2 3 1 KEY (3200c2896168) X GRANT
> 351 2 3 1 KEY (45006f2ea60f) X GRANT
> 351 2 3 2 KEY (d401767461f7) X GRANT
> 351 2 3 1 KEY (3d001f510b9a) X GRANT
> 351 2 3 0 TAB IX GRANT
> 351 2 2 0 TAB IX GRANT
> 351 2 3 1 KEY (4300b371cd2a) X GRANT
> 351 2 3 1 KEY (3600951e03e7) X GRANT
> 351 2 3 1 KEY (42005dde7838) X GRANT
> 351 2 3 1 KEY (37007bb1b6f5) X GRANT
> 351 2 3 1 KEY (3400494168c2) X GRANT
> 351 2 3 1 KEY (3a002da1d5ad) X GRANT
> 351 2 3 1 KEY (3b00c30e60bf) X GRANT
> 351 2 3 2 KEY (fa03774b7ea1) X GRANT
> 351 2 3 2 KEY (0903f111a581) X GRANT
> 351 2 3 1 KEY (4000f1febe88) X GRANT
> 351 2 3 2 KEY (1403d98217fd) X GRANT
> 351 2 3 2 KEY (de016b7f8f59) X GRANT
> 351 2 3 2 KEY (700148cf11b9) X GRANT
> 351 2 3 2 KEY (d203a4f6c68f) X GRANT
> 351 2 3 2 KEY (7a0355fa61d9) X GRANT
> 351 2 3 1 KEY (390048c66915) X GRANT
> 351 2 3 1 KEY (3500f079bf5f) X GRANT
> 351 2 3 1 KEY (410038b9c480) X GRANT
> 351 2 3 2 KEY (9a004a0b4b3d) X GRANT
> 351 2 3 2 KEY (a8039f428374) X GRANT
> 351 2 3 2 KEY (56040a0703ea) X GRANT
> 351 2 3 2 KEY (4a0414fbe815) X GRANT
>
> 361 5 0 0 DB S GRANT
> 361 2 3 0 TAB IX GRANT
> 361 2 3 2 KEY (6a05f3bf6a68) X GRANT
> 361 2 3 2 KEY (6505ef2c2b6c) X GRANT
> 361 2 3 2 KEY (49017efdb29b) X GRANT
> 361 2 2 0 TAB IX GRANT
> 361 2 3 1 KEY (e20001f6db04) X GRANT
> 361 2 1150700000 0 TAB Sch-M GRANT
> 361 2 3 1 KEY (e300649167bc) X GRANT
> 361 2 3 1 KEY (e100ef596e16) X GRANT
>
> Using DBCC INPUTBUFFER I see that the blocking SPID is running a stored
> procedure. Using fn_get_sql I see that the current command is a user
> defined function within that procedure (but one that does not hit any
> tables, it just parses a string). The UDF is called while updating a
> temporary table.
>
> I am stumped. Can anyone offer any suggestions?
>
> Thanks!
>
>
| |
| pshroads@gmail.com 2006-02-28, 8:24 pm |
| I'm not sure how many records were being affected in this particular
instance. The stored procedure is called by a background task in our
application. It runs every 30 minues and ran with no problem until
yesterday. There are no errors recorded anywhere -- the spid has just
been apparently idle not doing any work and blocking other spids. I ran
the stored procedure maually from Query Analyzer and it ran
successfully.
The UDF just parses a string and inserts the output in to a temporary
table. The UDF does not access any tables.
We have since tried killing the SPIDs that were being blocked as well.
Now all the SPIDS - both the blocker and the blockees - have been in a
rollback state for hours.
| |
| Andrew J. Kelly 2006-02-28, 8:24 pm |
| Not sure if this is all your problem or not but it does seem to apply and is
always a good idea to implement.
http://support.microsoft.com/kb/328551
--
Andrew J. Kelly SQL MVP
<pshroads@gmail.com> wrote in message
news:1141143315.988766.304430@j33g2000cwa.googlegroups.com...
>I have a strange situation. A SPID is blocking 3 other SPIDs. The
> blocking SPID's lastwaittype is PAGELATCH_UP and the waittype is 0x0413
> (Update buffer page latch). The waitresource is 2:1:444 indicating
> something in tempdb.
>
> The blocked SPIDs all have the exact same lastwaittype, waittype and
> waitresource as the blocking SPID. After the blocking SPID had been
> running for over three hours we decided to kill it. It has now been
> rolling back for over 15 hours. Sysprocesses shows no i/o or CPU
> activity for either the blocking SPID or the blocked SPIDs. The only
> thing moving is the waittime. I ran a profiler trace on the blocking
> and blocked SPID and there appears to be no activity whatsoever.
>
> Using sp_lock I can see that the blocking SPID (336) has an exclusive
> lock on page 444 in tempdb:
>
> spid dbid ObjId IndId Type Resource Mode Status
> ------ ------ ----------- ------ ---- ---------------- -------- ------
> 336 5 56035631 0 TAB Sch-S GRANT
> 336 2 1038803711 0 TAB X GRANT
> 336 2 0 0 EXT 1:1114744 X GRANT
> 336 5 1440724185 0 TAB Sch-S GRANT
> 336 5 597577167 0 TAB Sch-S GRANT
> 336 5 437576597 0 TAB Sch-S GRANT
> 336 5 569417448 0 TAB Sch-S GRANT
> 336 2 0 0 PAG 1:444 X GRANT
> 336 2 99 0 RID 1:652945:4 X GRANT
> 336 5 1229247434 0 TAB Sch-S GRANT
> 336 5 1870121903 0 TAB Sch-S GRANT
> 336 5 72035688 0 TAB Sch-S GRANT
> 336 5 0 0 DB S GRANT
>
> The locks for the blocking spids are:
>
> spid dbid ObjId IndId Type Resource Mode Status
> ------ ------ ----------- ------ ---- ---------------- -------- ------
> 199 5 0 0 DB S GRANT
> 199 2 2 0 TAB IX GRANT
> 199 2 3 0 TAB IX GRANT
> 199 2 3 1 KEY (4f00a7ed7a34) X GRANT
> 199 2 3 1 KEY (4d002c25739e) X GRANT
> 199 2 3 2 KEY (b500f9ccd622) X GRANT
> 199 2 3 1 KEY (4e00c28ac68c) X GRANT
> 199 2 1313460556 0 TAB Sch-M GRANT
> 199 2 3 2 KEY (f801aa1d1a1b) X GRANT
> 199 2 3 2 KEY (e00123c3615b) X GRANT
>
> 351 5 0 0 DB S GRANT
> 351 2 3 2 KEY (2c03f5e906a1) X GRANT
> 351 2 3 2 KEY (ef030f1cc5a3) X GRANT
> 351 2 959571505 0 TAB Sch-M GRANT
> 351 2 3 2 KEY (0b037f9b0b6b) X GRANT
> 351 2 3 2 KEY (3b028953ed31) X GRANT
> 351 2 3 2 KEY (3303ac26b345) X GRANT
> 351 2 3 2 KEY (1303b2e0451f) X GRANT
> 351 2 3 2 KEY (47045d5d8447) X GRANT
> 351 2 3 2 KEY (b604fc5513dc) X GRANT
> 351 2 3 1 KEY (3c00a669dc07) X GRANT
> 351 2 3 1 KEY (3f0094990230) X GRANT
> 351 2 3 1 KEY (3e007a36b722) X GRANT
> 351 2 3 1 KEY (38001ed60a4d) X GRANT
> 351 2 3 1 KEY (4400d6167192) X GRANT
> 351 2 3 1 KEY (33002c26d47a) X GRANT
> 351 2 3 1 KEY (3200c2896168) X GRANT
> 351 2 3 1 KEY (45006f2ea60f) X GRANT
> 351 2 3 2 KEY (d401767461f7) X GRANT
> 351 2 3 1 KEY (3d001f510b9a) X GRANT
> 351 2 3 0 TAB IX GRANT
> 351 2 2 0 TAB IX GRANT
> 351 2 3 1 KEY (4300b371cd2a) X GRANT
> 351 2 3 1 KEY (3600951e03e7) X GRANT
> 351 2 3 1 KEY (42005dde7838) X GRANT
> 351 2 3 1 KEY (37007bb1b6f5) X GRANT
> 351 2 3 1 KEY (3400494168c2) X GRANT
> 351 2 3 1 KEY (3a002da1d5ad) X GRANT
> 351 2 3 1 KEY (3b00c30e60bf) X GRANT
> 351 2 3 2 KEY (fa03774b7ea1) X GRANT
> 351 2 3 2 KEY (0903f111a581) X GRANT
> 351 2 3 1 KEY (4000f1febe88) X GRANT
> 351 2 3 2 KEY (1403d98217fd) X GRANT
> 351 2 3 2 KEY (de016b7f8f59) X GRANT
> 351 2 3 2 KEY (700148cf11b9) X GRANT
> 351 2 3 2 KEY (d203a4f6c68f) X GRANT
> 351 2 3 2 KEY (7a0355fa61d9) X GRANT
> 351 2 3 1 KEY (390048c66915) X GRANT
> 351 2 3 1 KEY (3500f079bf5f) X GRANT
> 351 2 3 1 KEY (410038b9c480) X GRANT
> 351 2 3 2 KEY (9a004a0b4b3d) X GRANT
> 351 2 3 2 KEY (a8039f428374) X GRANT
> 351 2 3 2 KEY (56040a0703ea) X GRANT
> 351 2 3 2 KEY (4a0414fbe815) X GRANT
>
> 361 5 0 0 DB S GRANT
> 361 2 3 0 TAB IX GRANT
> 361 2 3 2 KEY (6a05f3bf6a68) X GRANT
> 361 2 3 2 KEY (6505ef2c2b6c) X GRANT
> 361 2 3 2 KEY (49017efdb29b) X GRANT
> 361 2 2 0 TAB IX GRANT
> 361 2 3 1 KEY (e20001f6db04) X GRANT
> 361 2 1150700000 0 TAB Sch-M GRANT
> 361 2 3 1 KEY (e300649167bc) X GRANT
> 361 2 3 1 KEY (e100ef596e16) X GRANT
>
> Using DBCC INPUTBUFFER I see that the blocking SPID is running a stored
> procedure. Using fn_get_sql I see that the current command is a user
> defined function within that procedure (but one that does not hit any
> tables, it just parses a string). The UDF is called while updating a
> temporary table.
>
> I am stumped. Can anyone offer any suggestions?
>
> Thanks!
>
| |
| pshroads@gmail.com 2006-03-05, 8:23 pm |
| Thanks for the reply. I have seen this article before and while it does
seem like a good idea in general I don't think it applies in my case. I
don't see any contention for the PFS or SGAM pages, only for page 444.
The SPIDs have now been rolling back for over two days. They had only
run for 3 hours when I killed them. The cause of the initial blocking
is one thing but the fact that these SPIDs can't even rollback is a
seperate issue.
| |
| Andrew J. Kelly 2006-03-05, 8:23 pm |
| Yes I didn't think it would fix the rollback issue but you did have
PAGELATCH_UP waits which are one of the indications that you have contention
internal to the files. There is no down side to adding multiple files to
tempdb regardless of if it fixes this issue or not. But it may play a part
in how it all started though I have not seen these adverse effects due
solely to the contention.
--
Andrew J. Kelly SQL MVP
<pshroads@gmail.com> wrote in message
news:1141229299.327385.262230@z34g2000cwc.googlegroups.com...
> Thanks for the reply. I have seen this article before and while it does
> seem like a good idea in general I don't think it applies in my case. I
> don't see any contention for the PFS or SGAM pages, only for page 444.
> The SPIDs have now been rolling back for over two days. They had only
> run for 3 hours when I killed them. The cause of the initial blocking
> is one thing but the fact that these SPIDs can't even rollback is a
> seperate issue.
>
| |
| pshroads@gmail.com 2006-03-05, 8:23 pm |
| Andrew -
The article about fixig tempdb contention says that, "This problem was
first corrected in Microsoft SQL Server 2000 Service Pack 4." I am
running sp4. I'm not clear on whether that means that it's no longer
necessary to implement the changes in the article to reduce contention.
Do you know?
BTW, the original problem with the blocking turned critical for us
yesterday. It seemed like the blocking was just harmless but in fact it
was causing the tempdb transaciton log to be unable to truncate. It
eventually grew and filled up the disk that tempdb is on. We had to
restart SQL Server and incur unscheduled downtime. I just have never
seen a sutuation where SQL Server was unable to rollback a SPID.
| |
| Andrew J. Kelly 2006-03-05, 8:23 pm |
| I checked earlier to see if you had SP4 but didn't find that info in your
previous posts. So if you have SP4 you don't need the hotfix but you still
need to create the multiple files for it to be effective. There is a KB
article that discusses issues where the log will not be truncated
automatically in Simple mode and cause the log to grow but I do not remember
anything in regards to not being able to roll back trans. If it happens
again I would give MS PSS a call.
--
Andrew J. Kelly SQL MVP
<pshroads@gmail.com> wrote in message
news:1141316093.990669.12310@j33g2000cwa.googlegroups.com...
> Andrew -
>
> The article about fixig tempdb contention says that, "This problem was
> first corrected in Microsoft SQL Server 2000 Service Pack 4." I am
> running sp4. I'm not clear on whether that means that it's no longer
> necessary to implement the changes in the article to reduce contention.
> Do you know?
>
> BTW, the original problem with the blocking turned critical for us
> yesterday. It seemed like the blocking was just harmless but in fact it
> was causing the tempdb transaciton log to be unable to truncate. It
> eventually grew and filled up the disk that tempdb is on. We had to
> restart SQL Server and incur unscheduled downtime. I just have never
> seen a sutuation where SQL Server was unable to rollback a SPID.
>
| |
| pshroads@gmail.com 2006-03-05, 8:23 pm |
| Thanks - sorry, one last question: do I need the Trace Flag -T1118 with
multiple files or was that just pre sp4?
Thanks again!
| |
| Andrew J. Kelly 2006-03-05, 8:24 pm |
| They say that the trace flag should not be required for the contention in
tempdb with SP4.
--
Andrew J. Kelly SQL MVP
<pshroads@gmail.com> wrote in message
news:1141323974.465926.205640@u72g2000cwu.googlegroups.com...
> Thanks - sorry, one last question: do I need the Trace Flag -T1118 with
> multiple files or was that just pre sp4?
>
> Thanks again!
>
|
|
|
|