Search code examples
mysqlsqlquery-optimization

Optimize query with functions in `group by` key?


I am using MySQL 8.0 and there is a slow query on a large table to be optimized.

The table contains 11 million rows of data and it's structure:

CREATE TABLE `ccu` (
  `id` bigint NOT NULL,
  `app_id` int NOT NULL,
  `ccu` int NOT NULL,
  `audit_create` datetime NOT NULL,
  PRIMARY KEY (`id`) USING BTREE,
  UNIQUE KEY `ccu_game_create_time_2a10bc69_idx` (`app_id`,`audit_create`) USING BTREE,
  KEY `ccu_audit_create_idx` (`audit_create`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci

My query is:

SELECT app_id, DATE(audit_create) cal_day, MAX(ccu) pcu, ROUND(AVG(ccu)) id_acu 
FROM ccu
WHERE audit_create BETWEEN DATE_SUB(DATE(NOW()), INTERVAL 29 DAY) AND DATE(NOW())
GROUP BY app_id, DATE(audit_create)

The query runs over 2 seconds. I add the condition by between ... and ... to filter useful data. However, the data stored in audit_create is in format yyyy-MM-dd HH:mm:ss, I have to use the date function but according to the execution plan only the where condition uses index(still has temporary table), the group by clause does not use any index at all. enter image description here

I have no right to alter the table structre to add a date column. Is it possible to optimize the query to lower the query time?


Solution

  • I was able to eliminate the Using temporary by adding an expression index:

    mysql> alter table ccu add key bk1 (app_id, (cast(audit_create as date)));
    Query OK, 0 rows affected (0.02 sec)
    Records: 0  Duplicates: 0  Warnings: 0
    
    mysql> explain SELECT app_id, DATE(audit_create) cal_day, 
       MAX(ccu) pcu, ROUND(AVG(ccu)) id_acu  
     FROM ccu 
     WHERE date(audit_create) BETWEEN DATE_SUB(DATE(NOW()), INTERVAL 29 DAY) AND DATE(NOW()) 
     GROUP BY app_id, cast(audit_create as date)\G 
    *************************** 1. row ***************************
               id: 1
      select_type: SIMPLE
            table: ccu
       partitions: NULL
             type: index
    possible_keys: bk1
              key: bk1
          key_len: 8
              ref: NULL
             rows: 1
         filtered: 100.00
            Extra: Using where
    

    Unfortunately, that EXPLAIN report shows it will use type: index which is an index-scan, in other words it will examine every one of the 11 million index entries. It could make it worse than it was in your query.

    The only other suggestion I have is to run this query once a day and store the result in a summary table. Running a 2-second query once a day so you can get the aggregate results quickly should be acceptable. But you said you don't have authority to add a column, so I guess you don't have authority to add a table either.

    In that case, get a faster computer with more RAM.