Search code examples
mysqlmariadbinnodb

Consistent InnoDB dumps with mysqldump


The MySQL documentation recommends that when dumping an InnoDB with mysqldump to use --single-transaction to get a transactionally consistent snapshot of the database with minimal blocking of other transactions. Unfortunately, this only works if you can somehow prevent any other connection from executing a schema change during the dump:

https://dev.mysql.com/doc/refman/8.0/en/mysqldump.html

While a --single-transaction dump is in process, to ensure a valid dump file (correct table contents and binary log coordinates), no other connection should use the following statements: ALTER TABLE, CREATE TABLE, DROP TABLE, RENAME TABLE, TRUNCATE TABLE. A consistent read is not isolated from those statements, so use of them on a table to be dumped can cause the SELECT that is performed by mysqldump to retrieve the table contents to obtain incorrect contents or fail.

What is a good way of ensuring that no schema changes happen during a dump? My use case is that I periodically (cronjob) back up my database every hour using the mysqldump utility.

I thought that maybe using mysqldump's default of --lock-tables instead would give me transactional consistency, but:

The --single-transaction option and the --lock-tables option are mutually exclusive because LOCK TABLES causes any pending transactions to be committed implicitly.

Is that saying that if someone runs a mysqldump --lock-tables against an InnoDB that it can actually cause an ongoing transaction in another session to no longer be transactional? I hope not -- that seems insane. Elsewhere:

https://dev.mysql.com/doc/refman/8.0/en/lock-tables.html#lock-tables-and-transactions

LOCK TABLES and UNLOCK TABLES interact with the use of transactions as follows:

LOCK TABLES is not transaction-safe and implicitly commits any active transaction before attempting to lock the tables.

This makes sense within the context of a single DB session. If you are in the middle of transaction and do this, then it implicitly commits your open transaction first. It isn't clear what it does, if anything, to transactions on other sessions. I'm hoping that it wouldn't do anything, or rather that it would be consistently ordered with respect to any ongoing transactions on other sessions.

Am I just reading too much into the mysqldump docs and using --lock-tables achieves transactional consistency?

Are the mysqldump docs simply explaining that doing --single-transaction together with --lock-tables is nonsensical because --lock-tables would immediately and implicitly close the single transaction? But otherwise mysqldump --lock-tables is transactionally consistent?

Thanks!


Solution

  • Lock tables forces a commit of the transaction in the same session. It doesn't force commit in other sessions.

    If you used both options with mysqldump, the session opened by the mysqldump client would start a transaction, and then immediately commit that transaction when the lock-tables happened. So there's no point to using them together.

    Furthermore, any other clients' transactions would not be committed, but they would block mysqldump's lock-tables operation. Any query (even read-only SELECT) will acquire a metadata lock, and lock-tables will wait for that metadata lock to be released. So the mysqldump would try to acquire the table locks, and have to wait.

    Meanwhile, any clients that try to start a new transaction after mysqldump begins waiting would also wait, as though they were in a queue behind mysqldump. This is bound to halt your application's progress.

    This type of wait isn't like a regular row-lock that times out after 50 seconds. The default timeout for metadata lock waits is 1 year (see https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_lock_wait_timeout).

    This pretty much rules out using mysqldump --lock-tables on a production site, doesn't it? It should only be used if you don't use transactional tables. I haven't used the --lock-tables option in years.

    If you use --lock-tables, any DDL statement is blocked because the DDL statement needs an exclusive metadata lock too.

    I don't know of any way to block DDL statements when using --single-transaction. You just have to refrain from doing that while backups are running.

    I would also comment that making a complete mysqldump every hour might not be the best backup strategy. You should consider making less frequent backups, and then use the binary log to get point-in-time recovery. See https://dev.mysql.com/doc/refman/8.0/en/point-in-time-recovery.html