Search code examples
mysqlsqlinnodbmariadb

Force hidden clustered index in innoDB


I have a table with IDs that are a hash of the "true primary key". Correct me if I'm wrong, but I think my inserts are very slow in this table because of the clustered index on this key (it takes multiple minutes for inserting 100 000 rows). When I change the key to a nonclustered index, I have the impression that innoDB still secretly clusters on it.

Is there a simple way to avoid that mysql clusters on my primary key without having to define an auto increment primary key?


Solution

  • InnoDB must have a PRIMARY KEY.

    1. Innodb's first preference is an explicit PRIMARY KEY, whether AUTO_INCREMENT or not.
    2. Then a UNIQUE key, but only if none of the columns are NULLable.
    3. Finally, InnoDB will create a hidden, 6-byte, integer that acts somewhat like an auto_increment.

    Scenario 1. Inserting into a table must find the block where the desired primary key is. For AUTO_INCREMENT and for #3, above, that will be the "last" block in the table. The 100K rows will go into about 1000 blocks at the "end" of the table.

    Scenario 2. Otherwise (non-AI, but explicit PK; or UNIQUE), a block needs to be found (possibly read from disk), the key checked for dup, then the block updated and marked for later rewriting to disk.

    If all the blocks fit in the buffer_pool, then either of those is essentially the same speed. But if the table is too big to be cached, then Scenario 2 becomes slow -- in fact slower and slower as the table grows. This is because of I/O. GUIDs, UUIDs, MD5s, and other hashes are notorious at suffering from this slow-down.

    Another issue: Transaction integrity dictates that each transaction incur some other I/O. Is your 100K inserts 100K transactions? 1 transaction? Best is to batch them in groups of 100 to 1000 rows per transaction.

    I hope those principles let you figure out your situation. If not, please provide CREATE TABLE for each of the options you are thinking about. Then we can discuss your details. Also provide SHOW VARIABLES LIKE 'innodb_buffer_pool_size'; and how much RAM you have.