Search code examples
mysqldatabase-partitioning

MySQL and correctly setting up partitions


Our customers are all part of the same database, where we store all of their images, blog posts, etc. I want to partition the tables to optimize performance for each and every one of their sites (specified by "site_id" in every table).

As an example, let's use our images table. Available columns are: image_id (int, primary, auto_inc), site_id (int, primary), width (int), height (int), filesize (int), uploaded (datetime).

I've tried just adding partitioning; ALTER TABLE images PARTITION BY KEY(image_id,site_id), which worked. However, I have no way of knowing what MySQL uses to "group" the partitions - is it optimized for what I want? That's where you guys come in.

Question 1

Is PARTITION BY KEY the correct way to set up a partition to get the site_id "separated" from each other? If not - how would you do it?

Question 2

I've seen that people are manually adding the partitions - p0, p1, p2, p3, etc. Do you need to specify the number of partitions, or does MySQL "understand" that I'm after the site_id?


Solution

  • Question 1

    If you use partition by key(image_id,site_id), you cannot be sure that each partition will only contain separated site_id's because this kind of partitionning will use a integrated hashing function on (image_id,site_id) and the result of this will determinate in wich partition the row will be inserted into.

    If you want to ensure separation you should use RANGE or LIST partitionning.

    Question 2

    If using RANGE or LIST partitionning, you will have to define the number of partitions you want. ie :

    PARTITION BY RANGE (site_id) (
    PARTITION p0 VALUES LESS THAN (6),
    PARTITION p1 VALUES LESS THAN (11),
    PARTITION p2 VALUES LESS THAN (16),
    PARTITION p3 VALUES LESS THAN (MAXVALUE)
    );
    

    LIST and RANGE requires a bit of maintenance. If new site_id are added / removed, you will have to adapt your partition scheme.

    KEY partitioning will ensure a balanced row repartition across the number of partitions specified:

    PARTITION BY KEY(image_id,site_id)
    PARTITIONS 10;
    

    Hope it helps.