I'm trying to understand the ACID properties and how they affect the way we think about concurrency in an ACID database. Say I have a table accounts with account_id and balance columns, and three rows in the table:
account_id | balance
-----------|--------
1 | 100
2 | 0
3 | 0
Now suppose I run the following transactions concurrently:
start transaction;
if (select balance from accounts where account_id = 1) >= 100 {
update accounts set balance = 100 where account_id = 2;
update accounts set balance = 0 where account_id = 1;
}
commit transaction;
and
start transaction;
if (select balance from accounts where account_id = 1) >= 100 {
update accounts set balance = 100 where account_id = 3;
update accounts set balance = 0 where account_id = 1;
}
commit transaction;
Note that the first updates account 2, and the second updates account 3. Is it possible that the table ends up in the following state:
account_id | balance
-----------|--------
1 | 0
2 | 100
3 | 100
In other words, is it possible that the balance is double-spent? Let's assume we are using SQL Server.
Is it possible for two SQL transactions to interleave on read?
Yes.
Different DBMSs may use different approaches to concurrency control; ACID is not the full picture. The SQL standard also defines several so-called transaction isolation levels. These levels, and their implementation in a specific DBMS (row versioning or locking), determine what happens in your example.
By default, SQL Server uses the READ COMMITTED transaction isolation level without row versioning. At this level it is easy to double-spend the account balance.
I used this helper stored procedure in the code below to print messages in the SSMS Messages pane:
CREATE PROCEDURE [dbo].[DebugPrintMessage]
    @ParamMessage nvarchar(4000)
AS
BEGIN
    SET NOCOUNT ON;
    SET XACT_ABORT ON;

    -- Escape the % symbol in the message, if it is there
    SET @ParamMessage = REPLACE(@ParamMessage, '%', '%%');

    -- Prepend the message with the current timestamp, to a second's precision
    SET @ParamMessage = CONVERT(nvarchar(19), SYSDATETIME(), 121) + ' ' + @ParamMessage;

    -- The PRINT command does not send the message to the client until its buffer is full or the batch ends;
    -- RAISERROR () WITH NOWAIT sends the message immediately
    RAISERROR (@ParamMessage, 0, 1) WITH NOWAIT;
END
GO
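For reference, calling the procedure looks like this (the message text is just an illustration; the output format follows from the CONVERT style 121 used above):

```sql
EXEC dbo.DebugPrintMessage 'loaded the test rows';
-- Appears in the Messages pane as something like:
-- 2021-03-29 19:12:13 loaded the test rows
```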
I opened two windows/connections/sessions in SSMS and put your code in each window. (One window updated ID = 2, the other ID = 3; I'm not repeating the second script here.)
EXEC dbo.DebugPrintMessage 'waiting to start';

-- Round up the current time to the next 30-second boundary,
-- so that both sessions start at the same moment
DECLARE @StartDateTime datetime2(0) = SYSDATETIME();
SET @StartDateTime = DATEADD(second, (DATEDIFF(second, '2020-01-01', @StartDateTime) / 30 + 1) * 30, '2020-01-01');
DECLARE @StartTimeString varchar(8);
SET @StartTimeString = CONVERT(varchar(8), @StartDateTime, 108);
WAITFOR TIME @StartTimeString;

SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
--SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
--SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

BEGIN TRANSACTION;
EXEC dbo.DebugPrintMessage 'began transaction';

IF (SELECT Balance FROM dbo.Accounts WHERE ID = 1) >= 100
BEGIN
    EXEC dbo.DebugPrintMessage 'waiting 2 sec';
    WAITFOR DELAY '00:00:02';

    EXEC dbo.DebugPrintMessage 'first update';
    UPDATE dbo.Accounts
    SET Balance = 100
    -- WHERE ID = 2
    WHERE ID = 3
    ;

    EXEC dbo.DebugPrintMessage 'second update';
    UPDATE dbo.Accounts
    SET Balance = 0
    WHERE ID = 1
    ;
END

EXEC dbo.DebugPrintMessage 'waiting for review';
WAITFOR DELAY '00:01:00';

EXEC dbo.DebugPrintMessage 'committing';
COMMIT;
Both transactions completed successfully, and in the end two accounts had a balance of 100. The output shows how it was executed:
Session 1
2021-03-29 19:12:13 waiting to start
2021-03-29 19:12:30 began transaction
2021-03-29 19:12:30 waiting 2 sec
2021-03-29 19:12:32 first update
(1 row affected)
2021-03-29 19:12:32 second update
(1 row affected)
2021-03-29 19:12:32 waiting for review
2021-03-29 19:13:32 committing
Session 2
2021-03-29 19:12:11 waiting to start
2021-03-29 19:12:30 began transaction
2021-03-29 19:12:30 waiting 2 sec
2021-03-29 19:12:32 first update
(1 row affected)
2021-03-29 19:12:32 second update
(1 row affected)
2021-03-29 19:13:32 waiting for review
2021-03-29 19:14:32 committing
We can see that session 1 did both updates without any extra waits. Session 2 did the first update together with session 1, but on the second update it waited for session 1 to commit (because it was attempting to update the same row with ID = 1). Session 2 continued only after session 1 finished its transaction. Pay attention to the timestamps after the message "second update".
I then tried this example with the transaction isolation level set to REPEATABLE READ and to SERIALIZABLE. One transaction completed and the other was aborted with this message:
Msg 1205, Level 13, State 51, Line 13 Transaction (Process ID 55) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
So, there was no double-spend under these more strict transaction isolation levels.
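Since under these levels one of the sessions can be chosen as a deadlock victim, the error message itself tells you to "Rerun the transaction" — the client, or the batch itself, needs retry logic. A minimal sketch in T-SQL (my own illustration, not part of the original scripts; THROW requires SQL Server 2012 or later):

```sql
DECLARE @Retry int = 3;
WHILE @Retry > 0
BEGIN
    BEGIN TRY
        BEGIN TRANSACTION;
        IF (SELECT Balance FROM dbo.Accounts WHERE ID = 1) >= 100
        BEGIN
            UPDATE dbo.Accounts SET Balance = 100 WHERE ID = 2;
            UPDATE dbo.Accounts SET Balance = 0   WHERE ID = 1;
        END
        COMMIT;
        BREAK; -- success, leave the retry loop
    END TRY
    BEGIN CATCH
        IF XACT_STATE() <> 0 ROLLBACK;
        -- 1205 = chosen as deadlock victim; rethrow anything else
        IF ERROR_NUMBER() <> 1205 THROW;
        SET @Retry -= 1;
    END CATCH
END
```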
Session 1
2021-03-29 19:27:12 waiting to start
2021-03-29 19:27:30 began transaction
2021-03-29 19:27:30 waiting 2 sec
2021-03-29 19:27:32 first update
(1 row affected)
2021-03-29 19:27:32 second update
(1 row affected)
2021-03-29 19:27:32 waiting for review
2021-03-29 19:28:32 committing
Session 2
2021-03-29 19:27:10 waiting to start
2021-03-29 19:27:30 began transaction
2021-03-29 19:27:30 waiting 2 sec
2021-03-29 19:27:32 first update
(1 row affected)
2021-03-29 19:27:32 second update
Msg 1205, Level 13, State 51, Line 27
Transaction (Process ID 55) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
Note here that the stricter transaction isolation level did not stop the two transactions from reading the same row: the SELECT completed in both sessions without problems. Nor did it stop them from updating different rows (ID = 2 and ID = 3). The conflict arose only when both transactions tried to update the same row with ID = 1.
In general, it is better to try the update of the ID = 1 row to balance = 0 before updating the balance of the other account. If you swap the UPDATE statements, one of the transactions will be blocked or aborted sooner, and there will be less work to roll back.
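With the statements swapped, the body of the transaction would look like this (a sketch using the same schema as above; under the stricter isolation levels the losing session now fails before it has touched any other row):

```sql
BEGIN TRANSACTION;
IF (SELECT Balance FROM dbo.Accounts WHERE ID = 1) >= 100
BEGIN
    -- Debit the shared account first: the contended row is touched
    -- before any other work is done
    UPDATE dbo.Accounts SET Balance = 0   WHERE ID = 1;
    -- Credit the destination account only after the debit
    UPDATE dbo.Accounts SET Balance = 100 WHERE ID = 2; -- or ID = 3 in the other session
END
COMMIT;
```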
Or, use locking hints on the SELECT, or sp_getapplock, or some other method to serialize the concurrent transactions.
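For example, an UPDLOCK hint makes the SELECT itself take an update lock on the ID = 1 row, so the second session blocks at the read and, once the first commits, sees the already-debited balance and skips its updates (a sketch; HOLDLOCK additionally keeps the lock until the end of the transaction):

```sql
BEGIN TRANSACTION;
IF (SELECT Balance
    FROM dbo.Accounts WITH (UPDLOCK, HOLDLOCK)
    WHERE ID = 1) >= 100
BEGIN
    UPDATE dbo.Accounts SET Balance = 100 WHERE ID = 2;
    UPDATE dbo.Accounts SET Balance = 0   WHERE ID = 1;
END
COMMIT;
```

This turns the read-then-write race into a simple queue on the row lock, at the cost of less concurrency on reads of that row.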