mysql database transactions data-integrity

MySQL transactions: multiple concurrent transactions and data integrity


I'm using transactions for managing data across several MySQL InnoDB tables in a reasonably complex web application. Briefly, a given transaction works as follows:

  1. Data is read from a row in a "user_point_totals" table
  2. Various machinations calculate what the user's new point total should be
  3. A new entry is created in the "user_point_totals" table reflecting the updated total

Suppose user A performs some action that has point-related ramifications. Step 1 is executed: that thread of execution reads the user's point total into memory, and the application begins calculating the new total. Meanwhile, user B performs an action that also affects user A's point total, and a second transaction begins. Because the first transaction has not yet completed, the second transaction reads the same starting point total (from the same table row) as the first. Transaction 1 then completes and creates a new row with its idea of what the new total should be, and shortly thereafter transaction 2 completes and creates a new row for the user's point total as well. The second transaction's total is now incorrect, because it fails to account for the new total created by transaction 1.
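
The interleaving described above can be sketched as two MySQL sessions. This is only an illustration: the column names (`id`, `user_id`, `total`), the user id 42, and the point values are assumptions, not taken from the actual application.

```sql
-- Hypothetical schema; only the table name comes from the description above.
CREATE TABLE user_point_totals (
  id      INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
  user_id INT UNSIGNED NOT NULL,
  total   INT NOT NULL,
  KEY idx_user_latest (user_id, id)
) ENGINE=InnoDB;

INSERT INTO user_point_totals (user_id, total) VALUES (42, 100);

-- Session 1 (user A's action, worth 10 points)
START TRANSACTION;
SELECT total FROM user_point_totals
WHERE user_id = 42 ORDER BY id DESC LIMIT 1;   -- reads 100

-- Session 2 (user B's action, worth 25 points to user A)
START TRANSACTION;
SELECT total FROM user_point_totals
WHERE user_id = 42 ORDER BY id DESC LIMIT 1;   -- also reads 100

-- Session 1 writes its result and commits
INSERT INTO user_point_totals (user_id, total) VALUES (42, 110);
COMMIT;

-- Session 2 writes its result, computed from the stale 100, and commits
INSERT INTO user_point_totals (user_id, total) VALUES (42, 125);
COMMIT;

-- The newest row now says 125, although 135 was intended.
```

Both transactions are atomic here, and neither fails. A plain `SELECT` in InnoDB takes no lock, so nothing stops the second session from starting with the same value.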

My questions are:

  • Is this scenario impossible due to the atomic nature of transactions, which I apparently don't understand as well as I should?
  • If not, how does one ensure that data integrity exists in these sorts of situations?

Thanks for your consideration!


Solution

  • On the technical level, you could use MySQL's table-locking (or row-locking) features. A lock makes a second transaction wait while another one is still computing a value from the same rows, instead of letting both start from the same data. On the other hand, this technique requires some care, for example around what happens if a process crashes while holding a lock.

    On the practical level, though, I doubt you would want to do something like this. Aggregate functions like SUM() or AVG() in MySQL are already optimised for this need. If what you need is a sum over some columns of a table, returned as a table, you could use a view or create a temporary table (possible but slower). You should not have a column that contains a value that could be computed from other columns. That leads to inconsistencies, since it is unclear which field is correct when the values don't agree. (Is it an input script error or a deliberate re-use of the field by some programmer?)

    On a second note, be sure to use InnoDB tables on your MySQL instance; otherwise your system won't be fully ACID-compliant, meaning you won't get the atomicity you need.
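
To make the row-locking option concrete, here is a minimal sketch using InnoDB's `SELECT ... FOR UPDATE`. It assumes a hypothetical `users` table with one row per user, which serves as a per-user lock, and assumes `user_point_totals` has `id`, `user_id`, and `total` columns. Those names and the values are illustrative, not taken from the question.

```sql
START TRANSACTION;

-- Take an exclusive row lock on this user's row (hypothetical `users` table).
-- A second transaction running the same statement for user 42 waits here
-- until this transaction commits or rolls back.
SELECT id FROM users WHERE id = 42 FOR UPDATE;

-- Read the current total. A locking read returns the latest committed row
-- rather than an older snapshot.
SELECT total
FROM user_point_totals
WHERE user_id = 42
ORDER BY id DESC
LIMIT 1
FOR UPDATE;

-- The application computes the new total (say 110) and records it.
INSERT INTO user_point_totals (user_id, total) VALUES (42, 110);

COMMIT;  -- releases the lock; a waiting transaction now reads 110
```

As for a crashing process: when a client connection is lost, MySQL rolls back its open transaction and releases the locks it held. A waiting transaction also gives up with an error once `innodb_lock_wait_timeout` (50 seconds by default) has passed, so the application should be prepared to retry.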
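
The advice to compute the total rather than store it could look roughly like the following. The `user_point_events` table, its columns, and the view name are hypothetical. The sketch also assumes each action can be expressed as an independent gain or loss of points. If the new total depends on the current total (a cap, for instance), locking is still needed.

```sql
-- Hypothetical ledger: one row per point-changing event, never a stored total.
CREATE TABLE user_point_events (
  id         BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
  user_id    INT UNSIGNED NOT NULL,
  delta      INT NOT NULL,   -- points gained (positive) or lost (negative)
  created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
  KEY idx_user (user_id)
) ENGINE=InnoDB;

-- Each action only appends its own change; it never reads the old total,
-- so concurrent transactions cannot overwrite each other's result.
INSERT INTO user_point_events (user_id, delta) VALUES (42, 10);
INSERT INTO user_point_events (user_id, delta) VALUES (42, 25);

-- The total is derived on demand through a view.
CREATE VIEW user_point_totals_v AS
SELECT user_id, SUM(delta) AS total
FROM user_point_events
GROUP BY user_id;

SELECT total FROM user_point_totals_v WHERE user_id = 42;  -- 35
```

Because no transaction performs a read-modify-write on a stored total, the order in which the two inserts commit does not change the result.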