Search code examples
mysqlsql-deletedatabase-partitioning

Delete optimization using mysql partitioned table (MySQL 5.1)


I have a log table that gets processed every night. Processing will be done on data that was logged yesterday. Once the processing is complete I want to delete the data for that day. At the same time, I have new data coming into the table for the current day. I partitioned the table based on day of week. My hope was that I could delete data and insert data at the same time without contention. There could be as many as 3 million rows of data a day being processed. I have searched for information but haven't found anything to confirm my assumption.
I don't want to have the hassles of writing a job that adds partitions and drop partitions as I have seen in other examples. I was hoping to implement a solution using seven partions. eg.

    CREATE TABLE `professional_scoring_log` (
      `professional_id` int(11) NOT NULL,
      `score_date` date NOT NULL,
      `scoring_category_attribute_id` int(11) NOT NULL,
      `displayable_score` decimal(7,3) NOT NULL,
      `created_at` datetime NOT NULL,
      PRIMARY KEY (`professional_id`,`score_date`,`scoring_category_attribute_id`),
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8
    /*!50100 PARTITION BY RANGE (DAYOFWEEK(`score_date`))
    (PARTITION Sun VALUES LESS THAN (2) ENGINE = InnoDB,
     PARTITION Mon VALUES LESS THAN (3) ENGINE = InnoDB,
     PARTITION Tue VALUES LESS THAN (4) ENGINE = InnoDB,
     PARTITION Wed VALUES LESS THAN (5) ENGINE = InnoDB,
     PARTITION Thu VALUES LESS THAN (6) ENGINE = InnoDB,
     PARTITION Fri VALUES LESS THAN (7) ENGINE = InnoDB,
     PARTITION Sat VALUES LESS THAN (8) ENGINE = InnoDB) */

When my job that processes yesterday's data is complete, it would delete all records where score_date = current_date-1. At any one time, I am likely only going to have data in one or two partitions, depending on time of day.

Are there any holes in my assumptions?


Solution

  • Charlie, I don't see any holes in your logic/assumptions.

    I guess my one comment would be why not use the drop/add partition syntax? It has to be more efficient than DELETE FROM .. Where ..; and it's just two calls - no big deal -- store "prototype" statements and substitute for "Sun" and "2" as required for each day of the week -- I often use sprintf for doing just that

    ALTER TABLE `professional_scoring_log` DROP PARTITION Sun;
    
    
    ALTER TABLE `professional_scoring_log` ADD PARTITION (
        PARTITION Sun VALUES LESS THAN (2)
    );