I have two query for same output
Slow Query:
SELECT
*
FROM
account_range
WHERE
is_active = 1 AND '8033576667466317' BETWEEN range_start AND range_end;
Execution Time: ~800 ms.
Explain:
+----+-------------+---------------+------------+------+-------------------------------------------+------+---------+------+--------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------------+------------+------+-------------------------------------------+------+---------+------+--------+----------+-------------+
| 1 | SIMPLE | account_range | NULL | ALL | range_start,range_end,range_se_active_idx | NULL | NULL | NULL | 940712 | 2.24 | Using where |
+----+-------------+---------------+------------+------+-------------------------------------------+------+---------+------+--------+----------+-------------+
Very Fast Query: learnt from here
SELECT
*
FROM
account_range
WHERE
is_active = 1 AND
range_start = (SELECT
MAX(range_start)
FROM
account_range
WHERE
range_start <= '8033576667466317') AND
range_end = (SELECT
MIN(range_end)
FROM
account_range
WHERE
range_end >= '8033576667466317')
Execution Time: ~1ms
Explain:
+----+-------------+---------------+------------+------+-------------------------------------------+---------------------+---------+-------------------+------+----------+------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------------+------------+------+-------------------------------------------+---------------------+---------+-------------------+------+----------+------------------------------+
| 1 | PRIMARY | account_range | NULL | ref | range_start,range_end,range_se_active_idx | range_se_active_idx | 125 | const,const,const | 1 | 100.00 | NULL |
| 3 | SUBQUERY | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | Select tables optimized away |
| 2 | SUBQUERY | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | Select tables optimized away |
+----+-------------+---------------+------------+------+-------------------------------------------+---------------------+---------+-------------------+------+----------+------------------------------+
Table Structure:
CREATE TABLE account_range (
id int(11) unsigned NOT NULL AUTO_INCREMENT,
range_start varchar(20) NOT NULL,
range_end varchar(20) NOT NULL,
is_active tinyint(1) NOT NULL,
bank_name varchar(100) DEFAULT NULL,
addedon timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
updatedon timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
description text,
PRIMARY KEY (id),
KEY range_start (range_start),
KEY range_end (range_end),
KEY range_se_active_idx (range_start , range_end , is_active)
) ENGINE=InnoDB AUTO_INCREMENT=946132 DEFAULT CHARSET=utf8;
Please do explain Why doesn't MySql automatically optimizes BETWEEN query?
Update:
Realised my mistake from @kordirko answer. My table contains only non-overlapping
ranges, so both queries are returning same results.
Such a comparision doesn't make sense, since you are comparing apples to oranges.
These two queries are not eqivalent, they give different resuts,
thus MySql optimises them in a different way and their plans can differ.
See this simple example: http://sqlfiddle.com/#!9/98678/2
create table account_range(
is_active int,
range_start int,
range_end int
);
insert into account_range values
(1,-20,100), (1,10,30);
First query gives 2 rows:
select * from account_range
where is_active = 1 and 25 between range_start AND range_end;
| is_active | range_start | range_end |
|-----------|-------------|-----------|
| 1 | -20 | 100 |
| 1 | 10 | 30 |
Second query gives only 1 row:
SELECT * FROM account_range
WHERE
is_active = 1 AND
range_start = (SELECT MAX(range_start)
FROM account_range
WHERE range_start <= 25
) AND
range_end = (SELECT MIN(range_end)
FROM account_range
WHERE range_end >= 25
)
| is_active | range_start | range_end |
|-----------|-------------|-----------|
| 1 | 10 | 30 |
To speed this query up (the first one), two bitmap indexes can be used together with "bitmap and" operation - but MySql doesn't have such a feature.
Another option is a spatial index (for example GIN indexes in PostgreSql: http://www.postgresql.org/docs/current/static/textsearch-indexes.html).
And another option is a star transformation (or a star schema) - you need to "divide" this table into two "dimension" or "measures" tables and one "fact" table .... but this is too broad topic, if you want know more you can start from here: https://en.wikipedia.org/wiki/Star_schema