Search code examples
mysqloptimizationdatabase-partitioning

MySQL partition selection not optimized correctly


I have a table 'fvs_data' with a timestamp column that I have used for a partition. The query optimizer doesn't seem to be selecting the correct partitions if I have a complex query. For example, this query

EXPLAIN PARTITIONS SELECT * FROM `fvs_data` WHERE `timestamp` = '2011-11-02'

uses the correct partition, as show in the partitions list: p_2011_44, since TO_DAYS('2011-11-02')=734808.

But if add another condition that should return the exact same partition, then it wants to check one of the other partitions

EXPLAIN PARTITIONS SELECT * FROM `fvs_data` WHERE `timestamp` > '2011-11-01' AND `timestamp` < '2011-11-03'

Which returns p_2011_42,p_2011_44 for the partition list. Here is my CREATE TABLE syntax

CREATE TABLE `fvs_data` (
 `id` int(11) NOT NULL AUTO_INCREMENT,
 `fvs_client_id` int(11) NOT NULL,
 `timestamp` datetime NOT NULL,
 `setupid` char(20) NOT NULL,
 `assyline` char(20) NOT NULL,
 `machine` char(20) NOT NULL,
 `side` char(20) NOT NULL,
 `module` char(20) NOT NULL,
 `fixtureid` char(20) NOT NULL,
 `fixturepos` char(20) NOT NULL,
 `feedpos` char(20) NOT NULL,
 `partnum` char(20) NOT NULL,
 `vendor` char(20) NOT NULL,
 `tid` char(20) NOT NULL,
 `quant` char(20) NOT NULL,
 `status` char(32) NOT NULL,
 `oper` char(20) NOT NULL,
 `lane1` char(20) NOT NULL,
 `lane2` char(20) NOT NULL,
 `sn` char(20) NOT NULL,
 `totalcomp` char(20) NOT NULL,
 PRIMARY KEY (`id`,`timestamp`),
 KEY `fvs_client_id` (`fvs_client_id`),
 KEY `setupid` (`setupid`),
 KEY `assyline` (`assyline`),
 KEY `machine` (`machine`),
 KEY `side` (`side`),
 KEY `module` (`module`),
 KEY `fixtureid` (`fixtureid`),
 KEY `fixturepos` (`fixturepos`),
 KEY `feedpos` (`feedpos`),
 KEY `partnum` (`partnum`),
 KEY `vendor` (`vendor`),
 KEY `tid` (`tid`),
 KEY `status` (`status`),
 KEY `oper` (`oper`),
 KEY `lane1` (`lane1`),
 KEY `lane2` (`lane2`),
 KEY `sn` (`sn`)
) ENGINE=MyISAM AUTO_INCREMENT=36032 DEFAULT CHARSET=latin1
/*!50100 PARTITION BY RANGE ( TO_DAYS(timestamp))
(PARTITION p_2011_42 VALUES LESS THAN (734796) ENGINE = MyISAM,
PARTITION p_2011_43 VALUES LESS THAN (734803) ENGINE = MyISAM,
PARTITION p_2011_44 VALUES LESS THAN (734810) ENGINE = MyISAM,
PARTITION p_2011_45 VALUES LESS THAN (734817) ENGINE = MyISAM) */

I'm trying to do weekly partitions, but not based on real weeks starting with Sunday, I just divided the day of the year by 7 and used that to create a partition name.


Solution

  • The partition pruning will always check the first partition, regardless of the values you are searching for. This has to do with the way partitioning stores NULLs in the first partition.

    To workaround this and optimize range partitioning, create an initial partition for values less than any you would normally store:

    ...
    PARTITION BY RANGE ( TO_DAYS(timestamp))
    (
    PARTITION p_0000 VALUES LESS THAN (0) ENGINE = MyISAM, -- NULLs end up here
    PARTITION p_2011_42 VALUES LESS THAN (734796) ENGINE = MyISAM,
    PARTITION p_2011_43 VALUES LESS THAN (734803) ENGINE = MyISAM,
    PARTITION p_2011_44 VALUES LESS THAN (734810) ENGINE = MyISAM,
    PARTITION p_2011_45 VALUES LESS THAN (734817) ENGINE = MyISAM
    PARTITION p_MAX VALUES LESS THAN MAXVALUE ENGINE = MyISAM
    )
    

    Of course your timestamp column is NOT NULL so that first partition will be empty, and therefore very quick to search.

    P.S.: You should also add a partition to catch any large values outside the range of any other partition. Otherwise inserting such rows will fail.