Search code examples
mysqlsqldatabasequery-optimization

MySQL ORDER BY using filesort (2 joined tables)


I have strange problem on query optimization. SQL is generated by ORM-like library and something bad has been detected only after reading megabytes of SQL logs.

SELECT  
  `ct_pricelistentry`.`uid` as `uid`, `ct_pricelistentry`.`skuGroup` as `skuGroup`
FROM 
  `ct_pricelistentry` INNER JOIN `lct_set` 
ON `lct_set`.`parent_uid`='SET:ALLPRICELISTENTRIES' AND 
   `lct_set`.`ref_uid`=`ct_pricelistentry`.`uid` 
WHERE 
  (`isGroup` IS FALSE) AND 
  (`isService` IS FALSE) AND 
  (`brand` = 'BRAND:5513f43697d637.00632331' OR `brand` IS NULL) 
ORDER BY `skuGroup` ASC

EXPLAIN says:

'1', 'SIMPLE', 'ct_pricelistentry', 'ALL', 'PRIMARY', NULL, NULL, NULL, '34591', 'Using where; Using filesort'

'1', 'SIMPLE', 'lct_set', 'eq_ref', 'PRIMARY', 'PRIMARY', '292', 'const,dealers_v2.ct_pricelistentry.uid', '1', 'Using where; Using index'

Note: all needed indexes are presented including skuGroup. But the index skuGroupstill is not listed in EXPLAIN possible_keys. It also cannot be forced by FORCE INDEX (it just disables all indexes).

After some research I found hacky solution but not sure it works as indended:

  1. Add FORCE INDEX (skuGroup),
  2. Add to the WHERE clause dummy AND (skuGroup IS NULL OR skuGroup IS NOT NULL) part.

Following query

SELECT  
  `ct_pricelistentry`.`uid` as `uid`, `ct_pricelistentry`.`skuGroup` as `skuGroup`
FROM 
  `ct_pricelistentry` FORCE INDEX (`skuGroup`) INNER JOIN `lct_set` 
ON `lct_set`.`parent_uid`='SET:ALLPRICELISTENTRIES' AND
   `lct_set`.`ref_uid`=`ct_pricelistentry`.`uid` 
WHERE 
  (`isGroup` IS FALSE) AND 
  (`isService` IS FALSE) AND 
  (`brand` = 'BRAND:5513f43697d637.00632331' OR `brand` IS NULL) AND
  (`skuGroup` IS NULL OR `skuGroup` IS NOT NULL)
ORDER BY `skuGroup` ASC

gives EXPLAIN without filesort so it seems to use the index to fetch ordered rows:

'1', 'SIMPLE', 'ct_pricelistentry', 'range', 'skuGroup', 'skuGroup', '768', NULL, '16911', 'Using where'

'1', 'SIMPLE', 'lct_set', 'eq_ref', 'PRIMARY', 'PRIMARY', '292', 'const,dealers_v2.ct_pricelistentry.uid', '1', 'Using where; Using index'

Whats happening at all? Is it a MySQL bug? I've tested on MySQL 5.1 - 5.5 - the same results. Do you have more predictable/stable solutions?

---- CREATE TABLE ----
CREATE TABLE IF NOT EXISTS `lct_set` (
  `parent_uid` varchar(48) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
  `ref_uid` varchar(48) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
  PRIMARY KEY (`parent_uid`,`ref_uid`),
  UNIQUE KEY `BACK_PRIMARY` (`ref_uid`,`parent_uid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

CREATE TABLE IF NOT EXISTS `ct_pricelistentry` (
  `uid` varchar(48) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
  `refcount` int(11) NOT NULL,
  `isDisposed` tinyint(1) DEFAULT NULL,
  `tag` text,
  `isGroup` tinyint(1) DEFAULT NULL,
  `parentEntry` varchar(48) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL,
  `externalUID` varchar(255) DEFAULT NULL,
  `productCode` varchar(16) DEFAULT NULL,
  `name` varchar(255) DEFAULT NULL,
  `sku` varchar(255) DEFAULT NULL,
  `skuGroup` varchar(255) DEFAULT NULL,
  `measureUnit` varchar(16) DEFAULT NULL,
  `image` varchar(48) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL,
  `itemClassExternalUID` varchar(255) DEFAULT NULL,
  `itemClassName` varchar(255) DEFAULT NULL,
  `itemClassDescription` text,
  `itemClassComments` text,
  `itemClassAttachments` varchar(48) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL,
  `brand` varchar(48) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL,
  `priceGroups` text,
  `productAttributes` text,
  `constituents` text,
  `position` int(11) DEFAULT NULL,
  `isService` tinyint(1) DEFAULT NULL,
  `stackability` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`uid`),
  UNIQUE KEY `test1` (`uid`,`skuGroup`),
  KEY `name` (`name`),
  KEY `sku` (`sku`),
  KEY `itemClassExternalUID` (`itemClassExternalUID`),
  KEY `parentEntry` (`parentEntry`),
  KEY `position` (`position`),
  KEY `externalUID` (`externalUID`),
  KEY `productCode` (`productCode`),
  KEY `skuGroup` (`skuGroup`),
  KEY `brand` (`brand`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

Solution

  • The fix Using the INDEX(skuGroup) avoids the filesort, but prevents any useful filtering. Optimizing the filtering is more important than avoiding the filesort.

    Remove the FORCE and add this 'composite' index

    INDEX(isGroup, isService, brand) -- (in any order)
    

    It should help, but probably won't prevent "using filesort". The OR is the killer.

    To prevent the use of filesort for ORDER BY, you need a single (usually composite) index that includes all of the WHERE clause, plus the ORDER BY column(s). In constructing such an index, the only things that can be handled in the WHERE are and'd together '=' clauses. Anything else (such as your OR) prevents the optimization.

    Why OR hurts Think of it this way... Suppose there were a long printed list of names sorted by last name + first name. And the query asked for WHERE last = 'Karakulov' ORDER BY first. You would jump to the first Karakulov, and there would be all the first names in order. Now suppose you wanted WHERE (last = 'Karakulov' OR last = 'James') ORDER BY first. You could get all your relatives and all my relatives, but you still need to shuffle them together to do the ORDER BY first. MySQL has one technique for that: filesort (and a tmp table leading up to it.)

    As a consolation, filesort's temp table is usually an in-memory MEMORY table, so it is reasonably fast.

    A workaround is sometimes to turn OR into UNION. (It probably would not help for your query.)

    Some schema critique, and other notes...

    The UNIQUE key is useless because the PRIMARY KEY already declares uid to be "unique".

    VARCHAR(48) utf8 is a rather clumsily big key. Is it some form of UUID? If so, I have nasty things to say about randomness and charset and size.

    Some uids are (48), some are (255); was this deliberate?

    Get rid of (skuGroupIS NULL ORskuGroupIS NOT NULL) -- The Optimizer is probably not smart enough to realize that this is always "TRUE" !

    FORCE INDEX may work today, but could backfire tomorrow. Get rid of it.

    What is the value of innodb_buffer_pool_size? It should be about 70% of available RAM if you have at least 4GB of ram. If you left it at some low default, then you are probably I/O-bound, hence slow.

    Please provide SHOW CREATE TABLE lct_set -- something strange is going on in the JOIN.