Search code examples
mysqlinnodbmvcc

MVCC - The atomicity problem of consistent reads


Reading and writing to the same tuple at the same time may cause read logic exceptions because of the non-atomic action of writing to the override of the tuple.

For MVCC in MySql,

Conceptually, because of ReadView, access to the tuple being written can be avoided by visibility rules, and then the race between reads and writes occurring in the same region can be avoided

But in terms of implementation details, I still have a question: Overwriting a field of a tuple will replace the data in the heap. If a read operation comes in, the same area will be read and written, which may result in read and write conflicts (byte copying is not atomic).

How can I avoid this read-write conflict? Is it a lock?


I did not express what I mean very well, I actually want to express the problem of lock competition:

Insert into tableA(age,num) values(1,1) Assume that there is a piece of data in the database.

At this time, under the read commit, two of the following transactions run concurrently at the same time: Transaction 1: select * from tableA Transaction 2: update tableA set age=2

The steps they run in the database are as follows:

  1. Transaction 1 accesses the only piece of data in the page: the transaction id that accesses the row of data, and judges that the data is visible through the visibility rule
  2. Transaction 2 locates the row of data and finds that the written age field is the same as the size occupied by the current data, so it starts to execute the replacement logic
  3. Transaction 2 copies the value of the age field in the current data to undo, then points the undo pointer to the past, and the transaction id is updated
  4. Transaction 2 writes the value 2 back to the age field of the current data
  5. Transaction 1 starts to access the age field, reads that the current data value is 2, and accesses the value of the num field to 1, and returns (2, 1)
  6. Transaction 2 commit

Through the operation of the above steps, you can see that the result returned by transaction 1 is not expected, and the root cause of this is that the action of reading the transaction id and the action of writing data (undo, transaction id, new data) are not mutually exclusive

mvcc is an access design, but the database engine will still compete for the same area when accessing and writing tuples. How does Innodb cleverly avoid read-write conflicts when implementing mvcc?


Solution

  • The first thing to realize is that there may be multiple versions of a row in existence at any time.

    Think of the processing working something like this. When a transaction starts, a copy of the entire database is taken. The transaction sees only those rows. When the next transaction comes along, it gets its own snapshot.

    There aren't really full copies; instead, each row (or version of a row) has a sequential transaction-number associated with it. And each transaction can see whether someone else has a row locked. There are (at least) two flavors of lock -- read and read-write. There are various rules about how much each blocks the other.

    Back to your question... A row is the unit of activity. When a row is to be touched, it must see what else is happening with that row. A simple Select might get away with a 'shared read lock', but an Insert/Update/Delete needs a stronger lock.

    Things are atomic. But the locking is at the row level, with transaction ids, history list (pending versions of each row), etc.

    And deadlocks are discovered early; one transaction is picked to be rolled back. Some other conflicts are resolved by stalling one transaction until another is finished (cf innodb_lock_wait_timeout).

    If I try to read a row and there is a not-yet-committed write pending to that row, I probably cannot see the pending write; instead, I see an older copy of the row. (cf transaction isolation modes)

    If by "a field of a tuple" you mean "a column of a row", then I go back to "the row is the unit of locking and ACID.

    Sorry if I have not been clear. I hope that I have given you some clues and mentioned some things to research.

    Versions

    (A simplistic discussion of Versioning) Each version is tied to a transaction. If the transaction is ROLLBACK'd, then that version is no longer of any use. For a COMMIT, it will become the winner and others will be not needed. This "cleanup" occurs after the client's query has "finished".

    Each Version has a transaction-id that is monotonic. This "history list" keeps track of the potential winners/loosers. And it lets a transaction see only the version of the row that the "transaction isolation" allows it to see. Eg, "READ COMMITTED" shows you an old version, not a not-yet-COMMITted version. "READ UNCOMMITTED" ("dirty read") shows you the latter.