Inefficient execution plan chosen by MySQL InnoDB


I am having trouble optimizing a query with the MySQL InnoDB optimizer. The following query (query 1) runs efficiently:

explain select * from ah_problems
where rnid in (6022342, 6256614, 5842714, 6302489)
and fieldid in (5,6);

and the plan (plan 1) is as follows:

id select_type table       type  possible_keys                   key           key_len ref rows Extra
=  ======      =========== ===== =============================== ============= ======= === ==== =====
1  SIMPLE      ah_problems range CONSTRAINTFIELDID,RNID__FIELDID RNID__FIELDID 8           33   Using where

So far, so good.

However, the slightly modified query (query 2) below gets a catastrophic execution plan:

explain select * from ah_problems
where rnid in (select rec.rnid as record_id from ar_records rec where rnid in (6022342, 6256614, 5842714, 6302489))
and fieldid in (5, 6)

The result is the same, but the plan (plan 2) now does a full table scan of ah_problems:

id select_type        table       type            possible_keys      key      key_len ref  rows     Extra
=  ======             =========== =====           ================== ======== ======= ==== =======  =====
1  PRIMARY            ah_problems ALL             CONSTRAINTFIELDID                        36177754 Using where
2  DEPENDENT SUBQUERY rec         unique_subquery PRIMARY            PRIMARY  4       func 1        Using index; Using where

In case you are wondering, the new subquery...

select rec.rnid as record_id from ar_records rec where rnid in (6022342, 6256614, 5842714, 6302489)

...does nothing more than return the four values that were hard-coded in query 1:

6022342
6256614
5842714
6302489

so queries (1) and (2) are equivalent.

Unfortunately, I need query 2, not query 1, and I want query 2 to be as efficient as query 1. I tried the following:

  1. Query 3: Add FORCE INDEX (RNID__FIELDID) to query 2. MySQL simply ignores it.

    explain select * from ah_problems force index (rnid__fieldid) where rnid in (select rec.rnid as record_id from ar_records rec where rnid in (6022342, 6256614, 5842714, 6302489)) and fieldid in (5,6)

The execution plan is the same as plan 2.

  2. Query 4: Add ORDER BY RNID, FIELDID to query 3. I saw in some other questions that this might trick the optimizer. It doesn't help.

    explain select * from ah_problems force index (rnid__fieldid) where rnid in (select rec.rnid as record_id from ar_records rec where rnid in (6022342, 6256614, 5842714, 6302489)) and fieldid in (5, 6) order by rnid, fieldid

Plan 4 now uses the index, but only as a full index scan (type index rather than range), so the row estimate is still catastrophic:

id select_type        table       type            possible_keys      key           key_len ref  rows      Extra
=  ======             =========== =====           ================== ========      ======= ==== =======   =====
1  PRIMARY            ah_problems index                              RNID__FIELDID 8             36179307 Using where
2  DEPENDENT SUBQUERY rec         unique_subquery PRIMARY            PRIMARY       4       func  1        Using index; Using where

In case it helps, here is the definition of my ah_problems table. Unfortunately, I cannot change the table definition. Is there anything I can do to make the MySQL optimizer use plan 1 to access ah_problems in query 2?

CREATE TABLE `ah_problems` (
  `ID` int(11) NOT NULL AUTO_INCREMENT COMMENT 'Identifier for update statements',
  `RNID` int(11) NOT NULL COMMENT 'Record number',
  `FIELDID` int(11) NOT NULL COMMENT 'Which field is value in',
  `VALUE` varchar(255) NOT NULL COMMENT 'The value the field got on MODIFIED_DATE',
  `PREVIOUSID` int(11) DEFAULT NULL COMMENT 'Reference to previous value',
  `MODIFIED_DATE` datetime NOT NULL COMMENT 'When was it changed',
  `MODIFIED_GROUPID` int(11) DEFAULT NULL COMMENT 'In what group did modified_userid change it',
  `MODIFIED_USERID` int(11) NOT NULL COMMENT 'Who changed it',
  PRIMARY KEY (`ID`),
  KEY `CONSTRAINTFIELDID` (`FIELDID`),
  KEY `CONSTRAINTMODIFIED_GROUPID` (`MODIFIED_GROUPID`),
  KEY `CONSTRAINTMODIFIED_USERID` (`MODIFIED_USERID`),
  KEY `CONSTRAINTPREVIOUSID` (`PREVIOUSID`),
  KEY `RNID__FIELDID` (`RNID`,`FIELDID`),
  CONSTRAINT `HPRB_FIELD` FOREIGN KEY (`FIELDID`) REFERENCES `ad_fields` (`ID`),
  CONSTRAINT `HPRB_MODIFIED_GROUP` FOREIGN KEY (`MODIFIED_GROUPID`) REFERENCES `ap_groups` (`ID`),
  CONSTRAINT `HPRB_MODIFIED_USER` FOREIGN KEY (`MODIFIED_USERID`) REFERENCES `ap_users` (`ID`),
  CONSTRAINT `HPRB_PREVIOUS` FOREIGN KEY (`PREVIOUSID`) REFERENCES `ah_problems` (`ID`) ON DELETE CASCADE,
  CONSTRAINT `HPRB_RN` FOREIGN KEY (`RNID`) REFERENCES `ar_records` (`RNID`)
) ENGINE=InnoDB AUTO_INCREMENT=72305308 DEFAULT CHARSET=utf8 COMMENT='PTR history'$$

Solution

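  • MySQL (at least in versions before 5.6, which added semi-join and materialization strategies for subqueries) cannot make the IN subquery the leading part of the plan and execute it only once. Instead, the subquery is re-evaluated for each row of the outer query in a loop, which is why plan 2 shows a DEPENDENT SUBQUERY and a full scan of ah_problems.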

    Replace it with a join:

    SELECT ahp.*
    FROM   ar_records ar
    JOIN   ah_problems ahp
    ON     ahp.rnid = ar.rnid
           AND ahp.fieldId IN (5, 6)
    WHERE  ar.rnid IN (6022342, 6256614, 5842714, 6302489)
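
    To make the reasoning above more concrete, here is a rough sketch of two related statements. The first is approximately the correlated form that the older optimizer evaluates for query 2; it is an illustration of the rewrite, not the literal internal representation. The second is a variant of the join that keeps the original subquery text intact by wrapping it in a derived table; the alias `picked` is a made-up name. The join should return each ah_problems row only once, assuming ar_records.rnid is unique, which plan 2 suggests (unique_subquery on PRIMARY). Run EXPLAIN on your own server to confirm that RNID__FIELDID is actually used.

    ```sql
    -- Roughly how query 2 is evaluated: the IN subquery becomes a correlated
    -- EXISTS that is probed once for every row of ah_problems.
    SELECT ahp.*
    FROM   ah_problems ahp
    WHERE  ahp.fieldid IN (5, 6)
    AND    EXISTS (SELECT 1
                   FROM   ar_records rec
                   WHERE  rec.rnid IN (6022342, 6256614, 5842714, 6302489)
                   AND    rec.rnid = ahp.rnid);

    -- Join variant that leaves the subquery untouched: the derived table
    -- (hypothetical alias `picked`) is built once and then drives the
    -- lookups into ah_problems.
    EXPLAIN
    SELECT ahp.*
    FROM   (SELECT rec.rnid AS record_id
            FROM   ar_records rec
            WHERE  rec.rnid IN (6022342, 6256614, 5842714, 6302489)) AS picked
    JOIN   ah_problems ahp
    ON     ahp.rnid = picked.record_id
    WHERE  ahp.fieldid IN (5, 6);
    ```

    The derived-table form may be handy when the subquery is generated elsewhere and cannot easily be flattened into the join condition.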