I am building a star schema to act as the backend for an analytics app I am building. My query generator is building queries using a regular star-join pattern. A sample query is below, whereby a fact table is joined to two dimension tables and the dimension tables are filtered by constant values chosen by the end user.
I am using MySQL 5.5 and all tables are MyISAM.
In this problem, I am simply trying to pull the first N rows (in this case, the first 1 row)
EXPLAIN
SELECT fact_table.*
FROM
fact_table
INNER JOIN
dim1 ON (fact_table.dim1_key = dim1.pkey)
INNER JOIN
dim2 ON (fact_table.dim2_key = dim2.pkey)
WHERE
dim1.constant_value = 123
AND dim2.constant_value = 456
ORDER BY
measure1 ASC LIMIT 1
The explain output follows. Both the dimension keys resolve to constant values since there is a unique key applied to their value.
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: dim1
type: const
possible_keys: PRIMARY,dim1_uk
key: dim1_uk
key_len: 8
ref: const
rows: 1
Extra: Using filesort
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: dim2
type: const
possible_keys: PRIMARY,dim2_uk
key: dim2_uk
key_len: 8
ref: const
rows: 1
Extra:
*************************** 3. row ***************************
id: 1
select_type: SIMPLE
table: fact_table
type: ref
possible_keys: my_idx
key: my_idx
key_len: 16
ref: const,const
rows: 50010
Extra: Using where
And here is the index on the fact table:
show indexes from fact_table
*************************** 10. row ***************************
Table: fact_table
Non_unique: 1
Key_name: my_idx
Seq_in_index: 1
Column_name: dim1_key
Collation: A
Cardinality: 24
Sub_part: NULL
Packed: NULL
Null:
Index_type: BTREE
Comment:
Index_comment:
*************************** 11. row ***************************
Table: fact_table
Non_unique: 1
Key_name: my_idx
Seq_in_index: 2
Column_name: dim2_key
Collation: A
Cardinality: 70
Sub_part: NULL
Packed: NULL
Null:
Index_type: BTREE
Comment:
Index_comment:
*************************** 12. row ***************************
Table: fact_table
Non_unique: 1
Key_name: my_idx
Seq_in_index: 3
Column_name: measure1
Collation: A
Cardinality: 5643
Sub_part: NULL
Packed: NULL
Null:
Index_type: BTREE
Comment:
Index_comment:
When profiling this query, I see the query spends the majority of its time performing a filesort operation "sorting result". My question is, even when using the correct index, why can't this query simply pull out the first value without doing a sort? The my_idx is already sorted on the right column and the two columns appearing first in the index resolve as constants, as shown in the plan.
If I rewrite the query, as follows, I am able to get the plan I want, with no file sorting.
SELECT fact_table.*
FROM
fact_table
WHERE
dim1_key = (select pkey from dim1 where constant_value = 123)
AND dim2_key = (select pkey from dim2 where constant_value = 456)
ORDER BY
measure1 ASC LIMIT 1
It would be expensive to change the tool generating these SQL commands so I would like to avoid this filesort even when the query is written in the original format.
My question is, why is MySQL opting to do a filesort even when the first keys on the index are constants (via an INNER JOIN) and the index is sorted in the right order? Is there a way around this?
My question is, why is MySQL opting to do a filesort even when the first keys on the index are constants (via an INNER JOIN) and the index is sorted in the right order? Is there a way around this?
Because the order of the resultset depends on the index used for reading the first table in the JOIN, but, as you see in EXPLAIN, the JOIN actually starts from dim1
table.
It might seem strange, but to implicitly force MySQL start from fact_table
you will need to change the indexes in the dimension tables to (pkey, constantvalue) instead of (constantvalue), otherwise MySQL optimizer will start with a table for which the condition constantvalue=some_value
returns minimum rows. The problem is that you might need those indexes for other queries.
Instead, you may try to add STRAIGHT_JOIN option to the SELECT and explicitly force the order.