Search code examples
mysqlmariadbexplain

Why does Mysql query get slow when I add a column to return?


I have a query that is running slowly (Both in Mysql 8 and MariaDB 10):

    select SQL_NO_CACHE UNIX_TIMESTAMP(created_at)* 1000 AS x,
           value AS y, day(created_at), month(created_at), year(created_at) 
    from `portfolios` 
    where `user_id` = 3 and (created_at)>( CURDATE() - INTERVAL 1 MONTH) 
    group by day(created_at),month(created_at), year(created_at) 
    order by `created_at` asc;

If I remove the 'value' column from the select, it runs fast.

The table is:

portfolios | CREATE TABLE `portfolios` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `user_id` int(10) unsigned NOT NULL,
  `value` decimal(15,8) NOT NULL,
  `created_at` timestamp NULL DEFAULT NULL,
  `updated_at` timestamp NULL DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `portfolios_user_id_created_at_index` (`user_id`,`created_at`),
  KEY `portfolios_user_id_updated_at_index` (`user_id`,`updated_at`),
  KEY `portfolios_id_user_id_updated_at_index` (`id`,`user_id`,`updated_at`),
  KEY `portfolios_user_id_index` (`user_id`)
) ENGINE=InnoDB AUTO_INCREMENT=11767164 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci

the table has about 12 million rows.

Running explain (without the 'value' column):

| id   | select_type | table      | type  | possible_keys                                                                                    | key                                 | key_len | ref  | rows   | Extra                                                     |
+------+-------------+------------+-------+--------------------------------------------------------------------------------------------------+-------------------------------------+---------+------+--------+-----------------------------------------------------------+
|    1 | SIMPLE      | portfolios | range | portfolios_user_id_created_at_index,portfolios_user_id_updated_at_index,portfolios_user_id_index | portfolios_user_id_created_at_index | 9       | NULL | 183996 | Using where; Using index; Using temporary; Using filesort |

Running explain (with the 'value' column):

| id   | select_type | table      | type  | possible_keys                                                                                    | key                                 | key_len | ref  | rows   | Extra                                                  |
+------+-------------+------------+-------+--------------------------------------------------------------------------------------------------+-------------------------------------+---------+------+--------+--------------------------------------------------------+
|    1 | SIMPLE      | portfolios | range | portfolios_user_id_created_at_index,portfolios_user_id_updated_at_index,portfolios_user_id_index | portfolios_user_id_created_at_index | 9       | NULL | 184038 | Using index condition; Using temporary; Using filesort |

So when I add this column to the query it seems to stop 'using where, using index' and start 'using index condition'.

I don't really understand why simply adding a column to the select is causing this to slow down (15 seconds compared to about .3) - is it due to the size of the table?

I'd appreciate any advice on how to get this running faster!

Thanks in advance.


Solution

  • Your query involves three columns: user_id, created_at and value.

    You have a few indexes on your table but none of them contains the value column. The explain seems to indicate that MySQL is using the (user_id, updated_at) index to filter matching rows but in order to generate the complete result it also scans the table to extract values from value column for matching rows.

    If you create an index on (user_id, created_at, value) it could be used as a covering index... to match the where clause and and to lookup value from the index. MySQL no longer needs to touch the 22mil rows table.


    Also, your GROUP BY is incorrect. I am guessing you need to count or sum the values for each day of year in which case you need to:

    SELECT UNIX_TIMESTAMP(CAST(created_at AS DATE)) * 1000, SUM(values)
    FROM t
    WHERE ...
    GROUP BY CAST(created_at AS DATE)
    ORDER BY CAST(created_at AS DATE)