The MySQL documentation (https://dev.mysql.com/doc/refman/8.0/en/innodb-locks-set.html) says:
If a duplicate-key error occurs, a shared lock on the duplicate index record is set. This use of a shared lock can result in deadlock should there be multiple sessions trying to insert the same row if another session already has an exclusive lock. ...
...
INSERT ... ON DUPLICATE KEY UPDATE differs from a simple INSERT in that an exclusive lock rather than a shared lock is placed on the row to be updated when a duplicate-key error occurs.
and I've read the source code (https://github.com/mysql/mysql-server/blob/f8cdce86448a211511e8a039c62580ae16cb96f5/storage/innobase/row/row0ins.cc#L1930) corresponding to this situation; InnoDB indeed sets an S or X lock when a duplicate-key error occurs:
if (flags & BTR_NO_LOCKING_FLAG) {
  /* Set no locks when applying log
  in online table rebuild. */
} else if (allow_duplicates) {
  ... ...
  /* If the SQL-query will update or replace duplicate key we will take
  X-lock for duplicates ( REPLACE, LOAD DATAFILE REPLACE, INSERT ON
  DUPLICATE KEY UPDATE). */
  err = row_ins_set_rec_lock(LOCK_X, lock_type, block, rec, index, offsets, thr);
} else {
  ... ...
  err = row_ins_set_rec_lock(LOCK_S, lock_type, block, rec, index, offsets, thr);
}
But I wonder why InnoDB has to set such locks; it seems these locks bring more problems than they solve (the problem they do solve is described in: MySQL duplicate key error causes a shared lock set on the duplicate index record?).
First, these locks can easily result in deadlock; the same MySQL documentation shows two examples of such deadlocks.
Worse, the S or X lock is not a single index-record lock but a next-key lock, so it can block the insertion of many values rather than just the one duplicated value.
For example:
CREATE TABLE `t` (
`id` int NOT NULL AUTO_INCREMENT,
`c` int DEFAULT NULL,
`d` int DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `uniq_idx_c` (`c`)
) ENGINE=InnoDB AUTO_INCREMENT=48 DEFAULT CHARSET=utf8mb4
mysql> select * from t;
+----+------+------+
| id | c    | d    |
+----+------+------+
| 30 |   10 |   10 |
| 36 |  100 |  100 |
+----+------+------+
mysql> show variables like '%iso%';
+-----------------------+-----------------+
| Variable_name         | Value           |
+-----------------------+-----------------+
| transaction_isolation | REPEATABLE-READ |
+-----------------------+-----------------+
1 row in set (0.41 sec)
# Transaction 1
mysql> begin;
Query OK, 0 rows affected (0.00 sec)
mysql> insert into t values (null, 100, 100);
ERROR 1062 (23000): Duplicate entry '100' for key 't.uniq_idx_c'
# not committed yet
# Transaction 2
mysql> insert into t values(null, 95, 95);
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
mysql> insert into t values(null, 20, 20);
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
mysql> insert into t values(null, 50, 50);
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
# Every value of c in the gap (10, 100) is blocked from being inserted, because the
# S lock taken by transaction 1 is a next-key lock on the index record for c = 100
# and the gap before it
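If it helps to see the lock directly: on MySQL 8.0 you can inspect it through performance_schema.data_locks while transaction 1 is still open (a sketch; the exact output depends on your server, but the held S next-key lock on uniq_idx_c and the waiting insert-intention lock from transaction 2 should both show up).
# Any other session, while transaction 1 is open and transaction 2 is waiting
mysql> select engine_transaction_id, index_name, lock_type, lock_mode, lock_status, lock_data from performance_schema.data_locks where object_name = 't';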
The goal in an ACID database is that queries in your session have the same result if you try to run them again.
Example: you run an INSERT query that results in a duplicate-key error. You would expect that if you retried that INSERT, it would fail again with the same error.
But what if another session updates the row that caused the conflict, and changes the unique value? Then if you retry your INSERT, it would succeed, which is unexpected.
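Using the table t from the question, the scenario looks like this (a sketch; the point is that the shared lock taken on the duplicate record makes the concurrent change wait):
# Transaction 1
mysql> begin;
mysql> insert into t values (null, 100, 100);
ERROR 1062 (23000): Duplicate entry '100' for key 't.uniq_idx_c'
# Transaction 2 tries to "free up" the conflicting unique value
mysql> update t set c = 101 where c = 100;
# blocks on the S lock held by transaction 1; without that lock the UPDATE would
# succeed immediately, and a retried INSERT in transaction 1 would then also
# succeed, giving a different result than the first attempt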
InnoDB has no way to implement true REPEATABLE-READ transactions when your statements are locking, e.g. INSERT/UPDATE/DELETE, or even SELECT with the locking options FOR UPDATE, FOR SHARE, or LOCK IN SHARE MODE. Locking SQL statements in InnoDB always act on the latest committed version of a row, not the version of that row that is visible to your session.
So how can InnoDB simulate REPEATABLE-READ, ensuring that the row affected by a locking statement is the same as the latest committed row?
By locking rows that are indirectly referenced by your locking statement, preventing them from being changed by other concurrent sessions.
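The "latest committed version" behaviour is easy to see with the same table t (a sketch under REPEATABLE-READ):
# Transaction 1
mysql> begin;
mysql> select d from t where c = 100;
# returns 100 (snapshot read)
# Transaction 2
mysql> update t set d = 999 where c = 100;
mysql> commit;
# Transaction 1, still in the same transaction
mysql> select d from t where c = 100;
# still returns 100: consistent read from the transaction's snapshot
mysql> select d from t where c = 100 for share;
# returns 999: the locking read acts on the latest committed version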