I have a mariadb 10.3 server, and the following table (using InnoDB storage engine):
create table if not exists token (
`token` bigint unsigned not null,
`uid` smallint unsigned not null default 0,
`nham` int default 0,
`nspam` int default 0,
`timestamp` int unsigned default 0
) Engine=InnoDB;
create index token_idx1 on token(token);
create index token_idx2 on token(uid);
The token table has ~900k rows, and I'd like to execute the following query with a 2-300 numbers in the IN ( )
clause:
select token, nham, nspam from token where token in (1,2,3,4,...);
Now the problem: the query executes very slowly, and it just won't use token_idx1
:
+------+-------------+-------+------+---------------+------------+---------+-------+--------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+------+---------------+------------+---------+-------+--------+-------------+ | 1 | SIMPLE | token | ref | token_idx1 | token_idx1 | 2 | const | 837534 | Using where | +------+-------------+-------+------+---------------+------------+---------+-------+--------+-------------+
Since the token column is indexed, I'm surprised that explain select says that the optimizer has no interest on token_idx1
(and the query takes a long time, ~30 sec due to the full table scan).
How to fix the issue? I'm aware that I could use USE INDEX(token_idx1)
in the query, but I'd solve it without such hack.
The solution is to rewrite the query. So while a query like this sucks at performance:
select token, nham, nspam from token where token in (1,2,3,4,...);
The following query is fast as it should be (even if some token values don't exist in the table):
select token, nham, nspam from token where token=1 or token=2 or token=3 or ...;
So the issue is solved, though I still don't understand why the optimizer has a hard time for the first query.
Anyway, thank you for all of your thoughts, ideas and contributions leading me to the workaround.