Search code examples
mysqlquery-optimizationdatabase-performance

mysql - how to insert multiple records and delete remaining ones


Lets start with an example. There's a table:

CREATE TABLE  X (
 id INT(32) NOT NULL auto_increment,
 account INT(32),
 value INT(32),
 primary KEY (id)
)

What I want to achieve is to ensure that given account has specified set of values (and only these values). Basically, how to do below efficiently:

delete from X where account = 5;
insert into X (account, value) values 
(5, 0), 
(5, 2), 
(5, 3)
...
(5, 99);

The caveat is that usually just one value changes (appears or disappears). Pretty much I have a set of values which changes, but instead of receiving deltas I'm getting whole set I need to efficiently reflect the difference. There will not be more than 100 values for given account at any time and usually there will be just 2-3. The changes happen thousands times per second. Changes to the same account are rare (usually just several, but could be more occasionally).

What I thought about is to set id account*1000+sequence_id to increase data locality for rows with the same account. Also instead of delete+reinsert I can do (pseudocode):

$current_values = select value from X where account = 5;
$to_add = $new_values not in $current_values
$to_remove = $current_values not in $new_values
delete from X where value in $to_remove
insert into X (account, value) values (5, $to_add[0]), (5, $to_add[1])...

How can I do it better?


Solution

    1. Add INDEX(account) to X.

    2. Collect the incoming information in a temporary table. tmp.

    3. DELETE X FROM X, ( SELECT DISTINCT account FROM tmp ) y WHERE x.account=y.account; [Check the syntax, and test before going into production.]

    4. INSERT INTO X (account, value) SELECT (account, value) FROM tmp;

    If you are using InnoDB , then I recommend BEGIN; DELETE...; INSERT...; COMMIT;. This will keep other connections from finding rows deleted that are about to be re-inserted.

    If that is too slow; let's talk further.

    An aside: INT(32) -- the "(32)" means nothing. An INT is a 4-byte integer regardless of the value after it.