I have a MySQL database table with 15 columns. It has 2 indexes; a primary which is an int (auto increment), and a compound unique index across 4 columns, int, int, int and datetime.
The table currently has 5.5M entries, and each new multi INSERT (1.2M rows) is taking less than 2 minutes when importing a SQL data file using the following method...
mysql -h localhost -P 3306 -u test_user -D my_database -C < data.sql
However, when I try to perform a similar process to the above, but using a multi UPDATE (only 2,000 rows!) using a single statement and only updating 1 column, e.g...
UPDATE table SET col15 = (CASE WHEN col2=x AND col3=y AND col4=z and col5=d THEN 'a' ... WHEN col2=x AND col3=y AND col4=z and col5=d THEN 'b' END);
...it is taking over 9 minutes (NOTE: the columns in the WHEN clause are the columns that make up the unique index).
This doesn't sound right to me. I appreciate that using a unique index will be slower than using the primary, but there is still a huge difference in performance between these 2 results!
Any thoughts?
True.
Updating 2000 rows requires:
WHERE
clause, all 5.5M rows will be checked.CASE
.col5
.9 minutes seem high, but you have a combination of checking 5.5M rows and taking extra time on 2000 of them.
Assuming you really meant for testing col2/3/4 the same way, but not col5, I suggest this would be much better:
UPDATE table SET col15 = ( CASE
WHEN col5=d THEN 'a' ...
WHEN col5=e THEN 'b' END )
WHERE col2=x
AND col3=y
AND col4=z
together with
INDEX(col2, col3, col4) -- in any order
The important thing is to have a WHERE
that filters. If there is nothing common between the two WHENs, the it will be faster to use two UPDATEs
. If you need them to happen 'simultaneously', then put them in a transaction (BEGIN...COMMIT
) to make the two Updates "atomic".