Search code examples
mysqlinsertduplicatesunique

ON DUPLICATE KEY adding rows


OK so I have the following table:

CREATE TABLE `Tags` (
  `tag_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `tag` varchar(100) DEFAULT NULL,
  PRIMARY KEY (`tag_id`),
  UNIQUE KEY `tag` (`tag`),
  KEY `tag_id` (`tag_id`)
) ENGINE=InnoDB AUTO_INCREMENT=0 DEFAULT CHARSET=utf8;
SET FOREIGN_KEY_CHECKS=1;

When I run the following query, it seems to be increasing the primary key even if it finds duplicates (even though it doesn't add the duplicate).

INSERT INTO Tags (tag) VALUES ('book'),('cats'),('dogs') ON DUPLICATE KEY UPDATE tag = tag

You get...

tag_id | tag
==============================
1      | book
2      | robots
3      | doodles
5      | cats
6      | dogs
==============================

How do I prevent this?


Solution

  • You can't really prevent this, and you shouldn't worry about it. Gaps in the auto-incremented id are almost never a problem.

    What is happening is that MySQL is attempting to insert the row. In order to insert the row, it has to construct it -- hence the auto-incremented column is incremented. When the row is already found, then the insert fails, but the auto-increment is already incremented.

    If you want to minimize the problem, you can attempt to find the duplicates before attempting the insert:

    INSERT INTO Tags (tag) 
        SELECT t.tag
        FROM (SELECT 'book' as tag UNION ALL
              SELECT 'cats' as tag UNION ALL
              SELECT 'dogs' as tag
             ) t
        WHERE NOT EXISTS (SELECT 1 FROM Tags t2 WHERE t2.tag = t.tag)
        ON DUPLICATE KEY UPDATE tag = VALUES(tag);
    

    This is not a 100% fix -- the subquery could have duplicates or another process could update the table while this is running. However, it will generally avoid inserting records when they shouldn't be inserted.