Search code examples
mysqlinnodbdatabase-locking

How to achieve database locking based protection against duplicates


In a web application, using the InnoDB storage engine, I was unable to adequately utilise database locking in the following scenario.

There are 3 tables, I will call them aa, ar and ai.

aa holds the base records, let's say articles. ar holds information related to each aa record and the relation between aa and ar is 1:m.

Records in ar are stored when a record from aa is read the first time. The problem is that when two requests are initiated at (nearly) the same to read a record from aa (which does not yet have its related records stored in ar), the ar records are duplicated.

Here is a pseudo code to help understand the situation:

  • Read the requested aa record.

  • Scan the ar table to find out if the given aa record has anything stored already. (Assume it has not.)

  • Consult ai in order to find out what is to be stored in ar for the given aa record. (ai seems somewhat irrelevant, but I found that it too has to be involved in the locking… may be wrong.)

  • Insert a few rows to ar

Here is what I want to achieve:

  • Read the requested aa record.

WITH OR WITHOUT USING A TRANSACTIONS, LOCK ar, SO ANY SUBSEQUENT REQUEST ATTEMPTING TO READ FROM ar WILL WAIT AT THIS POINT UNTIL THIS ONE FINISHES.

  • Scan the ar table to find out if the given aa record has anything stored already. (Assume it has not.) The problem is that in case of two simultaneous requests, both find there are no records in ar for the given aa record and they both proceed to insert the same rows twice. Otherwise, if there are, this sequence is interrupted and no INSERT occurs.

  • Consult ai in order to find out what is to be stored in ar for the given aa record. (ai seems somewhat irrelevant, but I found that it too has to be involved in the locking… may be wrong.)

  • Insert a few rows to ar

RELEASE THE LOCK ON ar

Seems simple enough, I was unsuccessful in avoiding the duplicates. I'm testing the simultaneous requests from a simple command in a Bash shell (using wget).

I have spent a while learning how exactly locking works with the InnoDB engine here http://dev.mysql.com/doc/refman/5.5/en/innodb-lock-modes.html and here http://dev.mysql.com/doc/refman/5.5/en/innodb-locking-reads.html and tried several ways to utilise the lock(s), still no luck.

I want the entire ar table locked (since I want to prevent INSERTs from multiple request to occur to it) causing further attempts to interact with this table to wait until the first lock is released. But there's only one mention of "entire table" being locked in the documentation (Intention Locks section in the first linked page) but that's not further discussed or I was unable to figure how to achieve it.

Could anyone point in the right direction?


Solution

  • SET tx_isolation='READ-COMMITTED';
    START TRANSACTION;
    SELECT * FROM aa WHERE id = 1234 FOR UPDATE;
    

    This ensures that only one thread gets access to a given row in aa at a time. No need to lock the ar table at all, because any other thread who may want to access row 1234 will wait.

    Then query ar to find out what rows exist for the corresponding aa, and decide if you want to insert more rows to ar.

    Remember that the row in aa is still locked. So be a good citizen by finishing your work quickly, and COMMIT promptly.

    COMMIT;
    

    This allows the next thread who has been waiting for the same row of aa to proceed. By using READ-COMMITTED, it will be able to see the just-committed new rows in ar.