Home > Archive > MS SQL Server > February 2006 > Isolation levels and SELECT's (even when using SERIALIZABLE!)









You are viewing an archived Text-only version of the thread. To view this thread in it's original format and/or if you want to reply to this thread please [click here]

 

Author Isolation levels and SELECT's (even when using SERIALIZABLE!)
JBilger

2006-02-03, 8:23 pm

I thought that it would be interesting to point out some very subtle
issues that can occur when using SELECT's inside a transaction that
also includes INSERTS/UPDATES or DELETES.). The issues really arise
when concurrency occurs (ie many users trying to execute this
transaction at the same time or even a user submitting a request for
the same transaction multiple times (by mistake)).

Let's just take the extreme example of using the SERIALIZABLE isolation
level and a simple transaction that only includes 1 SELECT and 1
INSERT. Moreover, let's assume that the only case we care about is
detecting when a user has submitted the same request for this
transaction multiple times (by mistake and hence we want to accept and
process the first and ignore the duplicate requests).

For example, due to the nature of Internet based services, it is
possible to receive multiple duplicate requests. The userID and
requestID pair will allow us to detect duplicate requests.

The entire goal of this transaction is to detect and reject duplicate
requests as well as detect and accept (process) valid (non-duplicate)
requests.

-------------------- begin transaction code ---------------------

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRANSACTION

SET @OUT_result = (SELECT count(*)
FROM userRequests
WHERE userID = @userID and requestID = @requestID
)

--- if no records were returned in the query above, that means that
this is not a duplicate request. So store the userID and requestID so
that subsequent duplicate requests can be detected and ignored
if (@OUT_result = 0)
BEGIN
INSERT INTO userRequests(userID,
requestID)
VALUES (@userID, @requestID)

--
-- logic removed to process the request
--
END

COMMIT TRANSACTION

-------------------- end transaction code ---------------------


Assume that userID:123 submits a request with requestID:5 multiple
times (this could also occur with a faulty database driver that
executes the same stored procedure call multiple times). Also assume
that at this point the userID/requestID pair (123,5) does not exist in
the userRequests table,

Imagine that the requests are received by the database server at the
same time where r1 is the first request received and r2 is the
subsequent (duplicate) request received.

r1 begins execution and the SELECT statement is processed. However, it
has not yet started executing the INSERT statement. From reading the
description of the SERIALIZABLE isolation level you may conclude that
the very fact of executing the SELECT statement would lock any other
process (ie r2) from *reading* that same record in the userRequests
table, but it does not. In fact, only when reading the details of
exactly which locks are acquired for SELECT statements in a
SERIALIZABLE isolation level do things become clear (only Shared locks
are acquired). The only time an exclusive lock would be acquired would
be when the INSERT is executed and that would be held until the end of
the transaction.

Imagine that it takes r1 8 seconds to return a result for the SELECT
query (again an exaggeration to make a point), during that time, r2
arrives and also executes the SELECT query. Since there are only S
locks on that table/page both processes (r1 and r2) will be able to
obtain the results for the query (result will be 0 for both). r1 will
then execute the INSERT statement first and then r2 will execute the
INSERT statement.

This is clearly not what was intended!

In the worst case, you get duplicate rows in your userRequests table,
at best your get a unique index constraint error (if you set up the
appropriate indicies).

------------------------ important point -------------------------
Many people believe that just because your transaction uses isolation
REPEATABLE READ
or SERIALIZABLE then you will not have concurrency problems. In fact,
many subtle bugs can occur with SELECT statements especially if they
occur first in the transaction before any INSERT/UPDATES/DELETES on the
same table.

Many books incorrecly state that the SERIALIZABLE isolation level
"totally isolates transactions from each other." This is completely
untrue and misleading.
-------------------------------------------------------------------------


To get around this, you have to add the "PAGLOCK XLOCK" hint to the
SELECT statement as in:

SET @OUT_result = (SELECT count(*)
FROM userRequests WITH (PAGLOCK XLOCK)
WHERE userID = @userID and requestID = @requestID
)

only then will an X lock be obtained for the page, hence blocking any
other processes from reading rows on that page. It's interesting to not
that SQL server (2000) does not provide an option to exclusively lock a
particular row. That would minimize blocking.

It's also interesting to note ANSI-SQL92 does not define some other
more constrictive isolation levels. At least one that truly makes a
transaction isolated from others.

Anyone have comments? Thoughts?

Thanks
Jeff

Brian Selzer

2006-02-03, 8:23 pm

I think that your thinking is reversed. Transaction isolation levels
determine how other transactions can affect the current transaction, not how
the current transaction may affect other transactions. SERIALIZABLE ensures
that no other transaction can affect the results of any query that you
issue. It does NOT prevent other transactions from querying the same
information. It does prevent other transactions from issuing modifications
that were they to commit, would change the results of your query. For
example, if your query involves a range, SERIALIZABLE not only prevents rows
that you've read from being changed or deleted, it also prevents rows that
would fall within that range from being inserted as well as changes to rows
that you didn't read that would make them fall within that range.
SERIALIZABLE applies a shared range-lock to the index that governs any range
specified in the query.

Note: I'm making a distinction here between a query and a modification. The
transaction isolation level does not control the behavior of any
modification to the database. It will affect the SELECT clause from an
INSERT...SELECT, but the rows accessed by the SELECT clause will by default
have shared locks applied, not exclusive locks, but the INSERTed rows will
always have exclusive locks applied. An exclusive lock is always, ALWAYS
held until the end of the transaction. It doesn't matter if the isolation
level is READ UNCOMMITTED, any exclusive locks applied will be held until
the final commit or a rollback.

Exclusive locks are seldom necessary on a SELECT. You should read up on
update locks. Update locks do not prevent other readers, but only one
transaction at a time can obtain an update lock on a resource. Update locks
were designed to prevent deadlocks due to mixing queries and modifications.
They can also be used to control the order in which locks are obtained,
further minimizing deadlocks.



"JBilger" <bilgerbrau@gmail.com> wrote in message
news:1139010876.106242.274630@g44g2000cwa.googlegroups.com...
>I thought that it would be interesting to point out some very subtle
> issues that can occur when using SELECT's inside a transaction that
> also includes INSERTS/UPDATES or DELETES.). The issues really arise
> when concurrency occurs (ie many users trying to execute this
> transaction at the same time or even a user submitting a request for
> the same transaction multiple times (by mistake)).
>
> Let's just take the extreme example of using the SERIALIZABLE isolation
> level and a simple transaction that only includes 1 SELECT and 1
> INSERT. Moreover, let's assume that the only case we care about is
> detecting when a user has submitted the same request for this
> transaction multiple times (by mistake and hence we want to accept and
> process the first and ignore the duplicate requests).
>
> For example, due to the nature of Internet based services, it is
> possible to receive multiple duplicate requests. The userID and
> requestID pair will allow us to detect duplicate requests.
>
> The entire goal of this transaction is to detect and reject duplicate
> requests as well as detect and accept (process) valid (non-duplicate)
> requests.
>
> -------------------- begin transaction code ---------------------
>
> SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
> BEGIN TRANSACTION
>
> SET @OUT_result = (SELECT count(*)
> FROM userRequests
> WHERE userID = @userID and requestID = @requestID
> )
>
> --- if no records were returned in the query above, that means that
> this is not a duplicate request. So store the userID and requestID so
> that subsequent duplicate requests can be detected and ignored
> if (@OUT_result = 0)
> BEGIN
> INSERT INTO userRequests(userID,
requestID)
> VALUES (@userID, @requestID)
>
> --
> -- logic removed to process the request
> --
> END
>
> COMMIT TRANSACTION
>
> -------------------- end transaction code ---------------------
>
>
> Assume that userID:123 submits a request with requestID:5 multiple
> times (this could also occur with a faulty database driver that
> executes the same stored procedure call multiple times). Also assume
> that at this point the userID/requestID pair (123,5) does not exist in
> the userRequests table,
>
> Imagine that the requests are received by the database server at the
> same time where r1 is the first request received and r2 is the
> subsequent (duplicate) request received.
>
> r1 begins execution and the SELECT statement is processed. However, it
> has not yet started executing the INSERT statement. From reading the
> description of the SERIALIZABLE isolation level you may conclude that
> the very fact of executing the SELECT statement would lock any other
> process (ie r2) from *reading* that same record in the userRequests
> table, but it does not. In fact, only when reading the details of
> exactly which locks are acquired for SELECT statements in a
> SERIALIZABLE isolation level do things become clear (only Shared locks
> are acquired). The only time an exclusive lock would be acquired would
> be when the INSERT is executed and that would be held until the end of
> the transaction.
>
> Imagine that it takes r1 8 seconds to return a result for the SELECT
> query (again an exaggeration to make a point), during that time, r2
> arrives and also executes the SELECT query. Since there are only S
> locks on that table/page both processes (r1 and r2) will be able to
> obtain the results for the query (result will be 0 for both). r1 will
> then execute the INSERT statement first and then r2 will execute the
> INSERT statement.
>
> This is clearly not what was intended!
>
> In the worst case, you get duplicate rows in your userRequests table,
> at best your get a unique index constraint error (if you set up the
> appropriate indicies).
>
> ------------------------ important point -------------------------
> Many people believe that just because your transaction uses isolation
> REPEATABLE READ
> or SERIALIZABLE then you will not have concurrency problems. In fact,
> many subtle bugs can occur with SELECT statements especially if they
> occur first in the transaction before any INSERT/UPDATES/DELETES on the
> same table.
>
> Many books incorrecly state that the SERIALIZABLE isolation level
> "totally isolates transactions from each other." This is completely
> untrue and misleading.
> -------------------------------------------------------------------------
>
>
> To get around this, you have to add the "PAGLOCK XLOCK" hint to the
> SELECT statement as in:
>
> SET @OUT_result = (SELECT count(*)
> FROM userRequests WITH (PAGLOCK XLOCK)
> WHERE userID = @userID and requestID = @requestID
> )
>
> only then will an X lock be obtained for the page, hence blocking any
> other processes from reading rows on that page. It's interesting to not
> that SQL server (2000) does not provide an option to exclusively lock a
> particular row. That would minimize blocking.
>
> It's also interesting to note ANSI-SQL92 does not define some other
> more constrictive isolation levels. At least one that truly makes a
> transaction isolated from others.
>
> Anyone have comments? Thoughts?
>
> Thanks
> Jeff
>



Erland Sommarskog

2006-02-04, 11:23 am

JBilger (bilgerbrau@gmail.com) writes:
> Imagine that it takes r1 8 seconds to return a result for the SELECT
> query (again an exaggeration to make a point), during that time, r2
> arrives and also executes the SELECT query. Since there are only S
> locks on that table/page both processes (r1 and r2) will be able to
> obtain the results for the query (result will be 0 for both). r1 will
> then execute the INSERT statement first and then r2 will execute the
> INSERT statement.
>
> This is clearly not what was intended!
>
> In the worst case, you get duplicate rows in your userRequests table,
> at best your get a unique index constraint error (if you set up the
> appropriate indicies).


I would expect neither. I would expect a deadlock, because under a
serializable transaction the COUNT(*) should return the same result
if resubmitted. Thus, none of the processes should be permitted to
insert data that would alter the result of the SELECT query for the
other process.

> To get around this, you have to add the "PAGLOCK XLOCK" hint to the
> SELECT statement as in:
>
> SET @OUT_result = (SELECT count(*)
> FROM userRequests WITH (PAGLOCK XLOCK)
> WHERE userID = @userID and requestID = @requestID
> )


Just to echo what Brian said, but more clearly: the hint you should
use is UPDLOCK. An UPDLOCK is a shared lock that does not block readers.
However, only one process at a time have an UPDLOCK on a resource,
so thus, the process that comes in second, will be held up here until
the other process have committed.

Using page locks is unnecessarily measures. This will reduce concurrency,
since it will also block unrelated requests from proceeding.


--
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pr...oads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodin...ions/books.mspx
JBilger

2006-02-06, 8:23 pm

Thanks for the detailed responses Erland and Brian!

SQL query analyzer does indeed show that for the SELECT statement,
UPDLOCK causes the Key RangeS-U lock to be granted to the first process
while the Key RangeS-U lock for the second process is in WAIT. Much
better than exclusive locking an entire page.

Thanks again.
J

Sponsored Links





Also available: Server administration forum archive | Web Design forum archive | Software forum archive | Hardware reviews archive | Programming forum archive

Copyright 2009 droptable.com