Lets start with an example. There's a table:
CREATE TABLE X (
id INT(32) NOT NULL auto_increment,
account INT(32),
value INT(32),
primary KEY (id)
)
What I want to achieve is to ensure that given account has specified set of values (and only these values). Basically, how to do below efficiently:
delete from X where account = 5;
insert into X (account, value) values
(5, 0),
(5, 2),
(5, 3)
...
(5, 99);
The caveat is that usually just one value changes (appears or disappears). Pretty much I have a set of values which changes, but instead of receiving deltas I'm getting whole set I need to efficiently reflect the difference. There will not be more than 100 values for given account at any time and usually there will be just 2-3. The changes happen thousands times per second. Changes to the same account are rare (usually just several, but could be more occasionally).
What I thought about is to set id account*1000+sequence_id to increase data locality for rows with the same account. Also instead of delete+reinsert I can do (pseudocode):
$current_values = select value from X where account = 5;
$to_add = $new_values not in $current_values
$to_remove = $current_values not in $new_values
delete from X where value in $to_remove
insert into X (account, value) values (5, $to_add[0]), (5, $to_add[1])...
How can I do it better?
Add INDEX(account)
to X.
Collect the incoming information in a temporary table. tmp
.
DELETE X FROM X, ( SELECT DISTINCT account FROM tmp ) y WHERE x.account=y.account;
[Check the syntax, and test before going into production.]
INSERT INTO X (account, value) SELECT (account, value) FROM tmp;
If you are using InnoDB , then I recommend BEGIN; DELETE...; INSERT...; COMMIT;
. This will keep other connections from finding rows deleted that are about to be re-inserted.
If that is too slow; let's talk further.
An aside: INT(32)
-- the "(32)" means nothing. An INT is a 4-byte integer regardless of the value after it.