Search code examples
mysqldatabase-partitioning

Mysql query with partition taking more time than without partitioning


I have a table with 300k rows. The table a quite heavy so it make slow every query. After trying a lot of index and other optimization I decided to create partitions on the table.

Now I have 3 version of table

  1. e_update
  2. e_update_partition(20 partition using HASH(on event_id))
  3. e_update_partition_event(12 partition with range 25K entries in every partition (on event_id) )

Now I am running same query on every table one by one and comparing timings

SELECT eu.event_id
FROM e_update eu
INNER JOIN event e ON e.id=eu.event_id
WHERE eu.start_date > 2010-10-15
  AND e.published=1
  AND eu.event_id > 25000
  AND eu.event_id < 50000;

Time taken - 189911 rows in set, 2 warnings (14.43 sec)

SELECT eu.event_id
FROM e_update_partition eu
INNER JOIN event e ON e.id=eu.event_id
WHERE eu.start_date > 2010-10-15
  AND e.published=1
  AND eu.event_id > 25000
  AND eu.event_id < 50000;

Time taken - 189911 rows in set, 2 warnings (15.87 sec)

Explain result-

+----+-------------+-------+-----------------------------------------------------------------------+-------+--------------------------------+-----------+---------+--------------------+--------+-----------------------+
| id | select_type | table | partitions                                                            | type  | possible_keys                  | key       | key_len | ref                | rows   | Extra                 |
+----+-------------+-------+-----------------------------------------------------------------------+-------+--------------------------------+-----------+---------+--------------------+--------+-----------------------+
|  1 | SIMPLE      | e     | NULL                                                                  | range | PRIMARY,published              | published | 6       | NULL               | 120674 | Using index condition |
|  1 | SIMPLE      | eu    | p0,p1,p2,p3,p4,p5,p6,p7,p8,p9,p10,p11,p12,p13,p14,p15,p16,p17,p18,p19 | ref   | event_id,start_date,event_id_2 | event_id  | 4       | biztradeshows.e.id |      1 | Using where           |
+----+-------------+-------+-----------------------------------------------------------------------+-------+--------------------------------+-----------+---------+--------------------+--------+-----------------------+

SELECT eu.event_id
FROM e_update_partition_event eu
INNER JOIN event e ON e.id=eu.event_id
WHERE eu.start_date > 2010-10-15
  AND e.published=1
  AND eu.event_id > 25000
  AND eu.event_id < 50000;

Time taken - 189911 rows in set, 2 warnings (20.56 sec)

Explain result-

+----+-------------+-------+----------------------------------+--------+--------------------------------+-----------+---------+--------------------+--------+-----------------------+
| id | select_type | table | partitions                       | type   | possible_keys                  | key       | key_len | ref                | rows   | Extra                 |
+----+-------------+-------+----------------------------------+--------+--------------------------------+-----------+---------+--------------------+--------+-----------------------+
|  1 | SIMPLE      | e     | NULL                             | range  | PRIMARY,published              | published | 6       | NULL               | 120674 | Using index condition |
|  1 | SIMPLE      | eu    | p3,p4,p5,p6,p7,p8,p9,p10,p11,p12 | eq_ref | event_id,start_date,event_id_2 | event_id  | 4       | biztradeshows.e.id |      1 | Using where           |
+----+-------------+-------+----------------------------------+--------+--------------------------------+-----------+---------+--------------------+--------+-----------------------+

Partition schema for 3rd query

(PARTITION p1 VALUES LESS THAN (25000) ENGINE = InnoDB,
PARTITION p2 VALUES LESS THAN (50000) ENGINE = InnoDB,
PARTITION p3 VALUES LESS THAN (75000) ENGINE = InnoDB,
PARTITION p4 VALUES LESS THAN (100000) ENGINE = InnoDB,
PARTITION p5 VALUES LESS THAN (125000) ENGINE = InnoDB,
PARTITION p6 VALUES LESS THAN (150000) ENGINE = InnoDB,
PARTITION p7 VALUES LESS THAN (175000) ENGINE = InnoDB,
PARTITION p8 VALUES LESS THAN (200000) ENGINE = InnoDB,
PARTITION p9 VALUES LESS THAN (225000) ENGINE = InnoDB,
PARTITION p10 VALUES LESS THAN (250000) ENGINE = InnoDB,
PARTITION p11 VALUES LESS THAN (275000) ENGINE = InnoDB,
PARTITION p12 VALUES LESS THAN (300000) ENGINE = InnoDB)

Why my 3rd query taking more time than other two Queries and using almost all the partitions ?


Solution

  • No amount of partitioning is going to help you with this one:

    e.published=1 
    

    boolean fields cannot be indexed effectively. Why? because they have only one of two values. This looks like a mutable field (one that you do updates on, since published will probably be flipped on and off). Such a field cannot be used in partitioning either.

    Your first option is to combine this published field with another field and create a composite index and hope that it has sufficient cardinality to be a usefull index.

    Your second option is to create an archive table and move unpublished items out to the archive table.

    BTW, your query has a condition that doesn't make much sense:

     and eu.event_id >25000 and eu.event_id>50000;
    

    This can be shortened to

     and eu.event_id > 50000;
    

    Update

    Why are all the partitions being queried? Well your first partition scheme is on a hash partitioning

    Partitioning by HASH is used primarily to ensure an even distribution of data among a predetermined number of partitions.

    So your data is in all the partitions

    The second scheme, if you look closely you will find that two of the partitions are not being used. And those are the partitions left out by your where clause.

    So the problem is in your Where clause :-)