I have an interrogation about the lost update transaction isolation issue.
Here is the corresponding figure:
And to quote the text accompanying the figure:
A lost update occurs if two transactions both update a data item and then the second transaction aborts, causing both changes to be lost. This occurs in systems that don’t implement concurrency control, where concurrent transactions aren’t isolated.
My interrogation relates to the above figure: why is the Tx A's change lost considering the fact that the commit of Tx A occurs before the rollback of Tx B? (The numbers indicate the order of events).
Can someone please explain?
P.S. I am quoting the Java Persistence with Hibernate Second Edition book published by Manning. (See: https://www.manning.com/books/java-persistence-with-hibernate-second-edition)
Edit: The text I quoted above together with the figure are meant to demonstrate the lost update issue. So it assumes the database has little to no isolation, hence the lost update. What I fail to understand is the order of the operations on the figure.. In other words, if the commit occurs before the rollback, then where is the problem? The rollback should not be taken into account...
The answer depends on implementation of transaction logging system of each database. There are different types of such system and if you want to go deeper in the subject, you need to elaborate these things first.
Also, the answer depends on what transaction isolation level has been taken. Again, it depends not only on particular database, but on particular instructions that were used to update database. Some databases allow to set any isolation level, even if it can make data inconsistent in case of concurrent update or rollback.
So, as you didn't provide any details on particular implementation, I can guess that your abstract database uses snapshot isolation. That means, that before any data modification, of a row, or range of rows or whole table, a copy of that data was made. The copy is initial state of modifying data.
Normally a transaction cannot be started before another transaction has not been finished, that requirement is achieved by locks. But in your example that happened and one transaction successfully modified data and another transaction rolled back.
Any transaction that cannot be committed should return incompletely modified data to initial state. For transaction B that state doesn't have transaction A changes, because isolation level is snapshot. Some databases work differently, by the way, and do not make any snapshots of initial state before modification, but save only changes in transactional log and then apply them when transaction considered committed.
So, the answer is: because initial state of transaction B didn't have transaction A changes. And that in some way is correct: rollback should always get data back to the state when transaction started.
UPDATE:
How it would look like if we would implement depicted situation in abstract programming language?
function Update(rowNumber, data){
initialState = getRowInitialState(rowNumber); // ------- 1
operationResult = updateRow(rowNumber, data);
if (operationResult == success)
commit(rowNumber); // ---------- 3
else
rollback(rowNumber, initialState)
}
function Delete(rowNumber){
initialState = getRowInitialState(rowNumber); // ---------- 2
operationResult = deleteRow(rowNumber); // <--- cause some problems
if (operationResult == success)
commit(rowNumber);
else
rollback(rowNumber, initialState) // -------- 4
}
Update(13, "aaaa");
Delete(13);
As you can see, rollback on Delete operation will reset data to initial state. Implying that there are no transaction log. Normally data wouldn't be really changed. Instead of that, in transaction log would be written information that some operation should be performed and some data should be changed. In this case, rollback really doesn't discard result of Update operation. Because the only thing it has to do is to remove record from transaction log. And if operation has succeeded, the changes from transaction log are applied to real data. But it looks like your abstract database does not have such mechanism. Or it illustrates access to transactionless database, which could be one of NoSQL databases. In this case there is no such log and synchronization should be performed on the client side.