Search code examples
sqlmysqlquery-optimization

MySQL8 ORDER BY with JOIN uses filesort even though the result set is sorted already (based on indexes)


Table schemas:

CREATE TABLE `table1` (
  `fieldToFilterBy` int NOT NULL,
  `someKindOfInt` int NOT NULL,
  `fieldToJoinOn` varchar(128) COLLATE utf8mb4_unicode_ci NOT NULL,
  PRIMARY KEY (`fieldToFilterBy`,`someKindOfInt`),
  UNIQUE KEY `intexOfTable1` (`fieldToFilterBy`,`fieldToJoinOn`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
CREATE TABLE `table2` (
  `fieldToJoinOn` varchar(128) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `fieldToSortBy` varchar(128) COLLATE utf8mb4_unicode_ci NOT NULL,
  PRIMARY KEY (`fieldToSortBy`),
  UNIQUE KEY `indexOfTable2` (`fieldToJoinOn`,`fieldToSortBy`),
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci

The query:

SELECT t1.fieldToFilterBy, t1.someKindOfInt, t1.fieldToJoinOn, t2.fieldToSortBy
FROM table1 t1
INNER JOIN table2 t2
ON t1.fieldToJoinOn = t2.fieldToJoinOn
WHERE t1.fieldToFilterBy = 123
ORDER BY <....>
LIMIT 100;

If for the ORDER BY <...> I use:

  1. t1.fieldToFilterBy - it uses index
  2. t2.fieldToFilterBy - the execution plan contains filesort even though the result set is already sorted and the field contains the same values as the field in the previous example
  3. t1/2.fieldToFilterBy, t2.fieldToSortBy - I'm interested in this one. The execution plan contains filesort even though the result set is already sorted (it uses the indexOfTable2 for join, taking into account how the nested loop inner join works and my own tests - it's always sorted)

I'm concerned about the filesort in the case 3 above. I found out that MySQL uses Quicksort or Merge sort for filesort, but no details about the implementation (seems like it can be very inefficient in some implementations, even if the array is sorted). + it creates a temporary table for sorting. I don't want to rely on the fact that it returns a sorted set of rows even without the ORDER BY clause, since the result is used for pagination.

Is there a way to help the MySQL understand that no sorting is required? Or am I missing something? Thank you!

Tried forcing indexes and using subselects, it didn't help


Solution

  • MySQL always reads tables via some index (assuming InnoDB, the default storage engine).

    If the order you request in your ORDER BY clause is the same as the order of the index the optimizer chooses, then there will be no Using filesort needed.

    But this only works for the first joined table.

    If you request an order by some column from any table after the first joined table, then MySQL must do a filesort, because it can't assume it will read rows sequentially, even though it uses an index to look up the rows.

    Note that the first joined table isn't necessarily the first table you name in your query. The optimizer can reorder tables in some cases. The real first joined table is the one listed on the first line of the EXPLAIN report.