I read about database isolation levels and transactional phenomena. Dirty reads are obvious but I don't understand dirty writes.
All descriptions of dirty write say something like:
A dirty write is when a process save[s to] a file data that has already been changed on disk by another process. The last process will then overwrite the data of the first one. https://gerardnico.com/data/property/dirty_write
Some other descriptions use examples to demonstrate a dirty write but not what will happen to solve that. https://esb-dev.github.io/mat/IsoLevel.pdf
This is a dirty write from the example:
- Saldo starts with 100
- T2 begins: update Acct set Saldo = 200 where Acct = 1
- T1 begins: update Acct set Saldo = 250 where Acct = 1
- T1 commits => Saldo=250
- T2 commits => Saldo=200
I don't know what will happen when the isolation level does not allow dirty writes.
I don't know what I should expect from transaction management. The example, modified:
Do we have a dirty write here? And what is the result if dirty write is allowed?
I have an additional question about Java/Spring JPA/Hibernate: Are write statements not sent to the database unless hibernate does a commit? Some isolation levels and phenomena only make sense if all statements are always transmitted instantly to the database.
In general transactions prevent dirty writes by locking rows that will be written to until the end of the transaction. This means that neither transaction will fail, it's just that the second write will be delayed until the first transaction is finished. I don't think there are any isolation levels that don't do this.
So when you think about dirty writes you can imagine not having any isolation. In your first example I would expect that Saldo's final account to be 250 because the update to 200 was overwritten.
Note that it's not necessarily a problem for one transaction to overwrite the write of another transaction. Even if we imagine that the two transactions ran serially (i.e. everything in the first transaction completes before the second transaction starts) and therefore completely isolated we still expect one transaction to overwrite the other transaction.
I think a better example of a dirty write would be if transaction A and B both try to write to rows 1 and 2, but transaction A writes last to row 1 and transaction B writes last to row 2. Then you will see a result that doesn't match our expectation of isolation.
For a concrete example you can imagine you have two rows representing first and second place of a race. The first transaction set racer 1 in first and racer 2 in second place. Moments later a second transaction wants to update to put racer 2 in first place and racer 1 in second place. If dirty writes are possible then you could end up with racer 1 (or racer 2) in both first and second place slots.
As I mentioned already you can fix this problem by using transaction of any isolation level which will lock rows that are being written to from being written to by any other transactions until the changes are committed.