
SQLite3 performance with OR of BETWEEN ranges


I have a fairly large SQLite3 database table with an indexed numeric field that I need to search against a list of value ranges. Because the values are huge 64-bit numbers, an IN clause is not an option. The queries typically look like this:

SELECT * FROM sometable WHERE ID BETWEEN 10 AND 11 
                           OR ID BETWEEN 20 AND 21 
                           OR ID BETWEEN 30 AND 31;

I have run into a strange performance limit. With up to 9 BETWEEN terms, the query is extremely fast (the ID field is indexed). But starting with 10 terms, the query becomes several orders of magnitude slower! I have not found any explanation for that limit in the documentation.

I found that the EXPLAIN QUERY PLAN statement can be used to see the change in behavior. I ran my experiments with SQLite 3.7.12, in case that matters.

For the sake of demonstration, let's create a very simple and empty table:

CREATE TABLE sometable(name TEXT, ID INTEGER);
CREATE INDEX id_idx ON sometable (ID ASC);

This query:

EXPLAIN QUERY PLAN SELECT * FROM sometable WHERE ID BETWEEN 10 AND 11 
 OR ID BETWEEN 20 AND 21 OR ID BETWEEN 30 AND 31 OR ID BETWEEN 40 AND 41
 OR ID BETWEEN 50 AND 51 OR ID BETWEEN 60 AND 61 OR ID BETWEEN 70 AND 71
 OR ID BETWEEN 80 AND 81 OR ID BETWEEN 90 AND 91;     

produces that result:

0|0|0|SEARCH TABLE sometable USING INDEX id_idx (ID>? AND ID<?) (~31250 rows)
0|0|0|SEARCH TABLE sometable USING INDEX id_idx (ID>? AND ID<?) (~31250 rows)
0|0|0|SEARCH TABLE sometable USING INDEX id_idx (ID>? AND ID<?) (~31250 rows)
0|0|0|SEARCH TABLE sometable USING INDEX id_idx (ID>? AND ID<?) (~31250 rows)
0|0|0|SEARCH TABLE sometable USING INDEX id_idx (ID>? AND ID<?) (~31250 rows)
0|0|0|SEARCH TABLE sometable USING INDEX id_idx (ID>? AND ID<?) (~31250 rows)
0|0|0|SEARCH TABLE sometable USING INDEX id_idx (ID>? AND ID<?) (~31250 rows)
0|0|0|SEARCH TABLE sometable USING INDEX id_idx (ID>? AND ID<?) (~31250 rows)
0|0|0|SEARCH TABLE sometable USING INDEX id_idx (ID>? AND ID<?) (~31250 rows)

While that query:

EXPLAIN QUERY PLAN SELECT * FROM sometable WHERE ID BETWEEN 10 AND 11 
 OR ID BETWEEN 20 AND 21 OR ID BETWEEN 30 AND 31 OR ID BETWEEN 40 AND 41
 OR ID BETWEEN 50 AND 51 OR ID BETWEEN 60 AND 61 OR ID BETWEEN 70 AND 71
 OR ID BETWEEN 80 AND 81 OR ID BETWEEN 90 AND 91 OR ID BETWEEN 100 AND 101;

produces that result:

0|0|0|SCAN TABLE sometable (~500000 rows)

SCAN TABLE means that the index is not used and the whole original table is searched, resulting in poor performance.
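
For anyone who wants to reproduce the comparison, here is a minimal sketch in Python using the standard sqlite3 module. The helper names (make_db, or_query, plan) are made up for illustration. The plan text depends on the SQLite library that Python is linked against, so a version other than 3.7.12 may word the plan differently and may not switch to a full scan at 10 terms.

```python
import sqlite3


def make_db():
    """Create the empty demo table and index from the question, in memory."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE sometable(name TEXT, ID INTEGER)")
    con.execute("CREATE INDEX id_idx ON sometable (ID ASC)")
    return con


def or_query(n_terms):
    """Build the OR-of-BETWEEN query with n_terms ranges: 10-11, 20-21, ..."""
    terms = ["ID BETWEEN %d AND %d" % (10 * i, 10 * i + 1)
             for i in range(1, n_terms + 1)]
    return "SELECT * FROM sometable WHERE " + " OR ".join(terms)


def plan(con, sql):
    """Return the human-readable detail column (the last one) of the plan."""
    return [row[-1] for row in con.execute("EXPLAIN QUERY PLAN " + sql)]


if __name__ == "__main__":
    con = make_db()
    print("SQLite library version:", sqlite3.sqlite_version)
    for n in (9, 10):
        print(n, "terms:")
        for detail in plan(con, or_query(n)):
            print("   ", detail)
```

Lines starting with SEARCH indicate an index lookup, and a line starting with SCAN indicates a full pass over the table.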

Is there a way (pragma / compilation switch / trick) to avoid that limit?


Solution

  • As you can see, SQLite tries to split up the query into multiple subqueries so that each range can be looked up individually in the index.

    However, when there are too many ranges, the query optimizer estimates that the combined cost of all the individual subqueries is higher than the cost of a single pass through the whole table.

    If your ranges contain fewer than 31250 rows (the per-range estimate shown in the query plan above), or if your table has more than 1000000 rows, you can try running the ANALYZE command to improve the cost estimates.

    As a last resort, you can split up the query manually to force separate lookups:

    SELECT * FROM sometable WHERE ID BETWEEN 10 AND 11
    UNION ALL
    SELECT * FROM sometable WHERE ID BETWEEN 20 AND 21
    UNION ALL
    SELECT * FROM sometable WHERE ID BETWEEN 30 AND 31 
    ...
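
When the list of ranges comes from application code, the UNION ALL form can be generated instead of typed by hand. The following is only a sketch in Python with the standard sqlite3 module; union_all_query and fetch_ranges are hypothetical helper names, and the bounds are passed as bound parameters rather than pasted into the SQL text.

```python
import sqlite3


def union_all_query(n_ranges):
    """Build one BETWEEN lookup per range, glued together with UNION ALL."""
    branch = "SELECT * FROM sometable WHERE ID BETWEEN ? AND ?"
    return " UNION ALL ".join([branch] * n_ranges)


def fetch_ranges(con, ranges):
    """Fetch all rows whose ID falls in any of the (low, high) pairs.

    Assumes the ranges do not overlap: unlike the OR form, UNION ALL
    returns a row once per matching branch.
    """
    if not ranges:
        return []
    # Flatten [(lo1, hi1), (lo2, hi2), ...] into [lo1, hi1, lo2, hi2, ...]
    params = [bound for pair in ranges for bound in pair]
    return con.execute(union_all_query(len(ranges)), params).fetchall()


if __name__ == "__main__":
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE sometable(name TEXT, ID INTEGER)")
    con.execute("CREATE INDEX id_idx ON sometable (ID ASC)")
    con.executemany("INSERT INTO sometable VALUES (?, ?)",
                    [("row%d" % i, i) for i in range(200)])
    ranges = [(10 * i, 10 * i + 1) for i in range(1, 13)]  # 12 ranges
    for row in fetch_ranges(con, ranges):
        print(row)
```

SQLite caps the number of terms in a compound SELECT (SQLITE_MAX_COMPOUND_SELECT, 500 by default) and the number of bound parameters per statement, so a very long list of ranges would have to be sent in several batches.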