Search code examples
mysqlinnodb

MySQL Transaction in Cron job


I have a PHP DAEMON on my Ubuntu server doing huge data inserts into InnoDB. The very same tables are also being used by people using the platform.

The DAEMON when not running in TRANSACTION mode uses about 60-70 secs for 100.000 inserts. When running in TRANSACTION mode, BEGIN .... COMMIT it uses 15-20 seconds.

However will TRANSACTION mode lock the tables, and prevent users using the platform to do inserts while the DAEMON TRANSACTION is beeing preformed? Locking the tables the users are manipulating for over 20 seconds is, of course, not desirable :)

Well I'm doing inserts in batches of 500 and 500 insie a FOR loop INSERT INTO (col1, col2) VALUES (a,b) etc. This is fine, and runs smooth, however I'm able to speed up the process significantly if i issue a BEGIN before the loop, and COMMIT after to loop, but this means the time between the BEGIN/COMMIT is over 60 seconds. But while the system is doing a few hundred thousand inserts, people using the platform can do inserts to the very same table. Will the system generated Inserts account for the user insets, or will the users have to wait XX seconds before their insert is processed?


Solution

  • Based on your description, you use innodb with the default autocommit mode enabled and you insert records one by one in a loop. Autocommit mode means that each insert is encapsulated into its own transaction, which is fine, but very slow, since each record is persisted separately into the disk.

    If you wrap your loop that inserts the records within begin - commit statements, all inserts are run within a single transaction and are persisted to the disk only once, when the commit is issued - this is why you experience the speed gain.

    Regardless of which way you insert the records, innodb will use locks. However, innodb only locks the record being inserted:

    INSERT sets an exclusive lock on the inserted row. This lock is an index-record lock, not a next-key lock (that is, there is no gap lock) and does not prevent other sessions from inserting into the gap before the inserted row.

    Prior to inserting the row, a type of gap lock called an insert intention gap lock is set. This lock signals the intent to insert in such a way that multiple transactions inserting into the same index gap need not wait for each other if they are not inserting at the same position within the gap. Suppose that there are index records with values of 4 and 7. Separate transactions that attempt to insert values of 5 and 6 each lock the gap between 4 and 7 with insert intention locks prior to obtaining the exclusive lock on the inserted row, but do not block each other because the rows are nonconflicting.

    This means, that having a transaction open for a longer period of time that only inserts records will not interfere with other users inserting records into the same table.

    Pls note, that issuing single insert statements in a loop is the least efficient way of inserting larger amount of data into MySQL.

    Either use bulk insert (build a single insert statement in the loop and execute it after the loop, paying attention to max_allowed_packet setting :

    INSERT statements that use VALUES syntax can insert multiple rows. To do this, include multiple lists of column values, each enclosed within parentheses and separated by commas. Example:

    INSERT INTO tbl_name (a,b,c) VALUES(1,2,3),(4,5,6),(7,8,9);
    

    Or use load data infile statement.

    These two solutions can significantly speed up the data insertion and will not cause table lock either.