Search code examples
mysqlquery-optimization

Optimize MySQL query with range filter on related table


So I have a news_news table (~16k rows), where every news can have many teams assigned to it (~33k associations).

CREATE TABLE `news_news` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `title` varchar(255) COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
  `body` longtext COLLATE utf8_unicode_ci NOT NULL,
  `location` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
  `date` date NOT NULL DEFAULT '0000-00-00',
  `status` char(1) COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
  `slug` varchar(100) COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
  `announcement` tinyint(1) NOT NULL DEFAULT '0',
  `author_id` int(11) NOT NULL DEFAULT '0',
  `priority` int(10) unsigned NOT NULL,
  `embed` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
  `source_url` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
  `source_name` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
  `site_id` int(11) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `news_news_slug` (`slug`),
  KEY `news_news_author_id` (`author_id`),
  KEY `news_news_site_id` (`site_id`)
) ENGINE=InnoDB AUTO_INCREMENT=16521 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci


CREATE TABLE `news_news_teams` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `news_id` int(10) unsigned NOT NULL,
  `team_id` int(10) unsigned NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `news_id` (`news_id`,`team_id`),
  KEY `news_news_teams_news_id` (`news_id`),
  KEY `news_news_teams_team_id` (`team_id`)
) ENGINE=InnoDB AUTO_INCREMENT=40393 DEFAULT CHARSET=latin1;

I want to get last 5 news which were tagged from some set of teams;

SELECT 
    `news_news`.`id`, `news_news`.`title`, `news_news`.`body`, `news_news`.`embed`,
    `news_news`.`location`, `news_news`.`date`, `news_news`.`priority`, `news_news`.`status`, 
    `news_news`.`slug`, `news_news`.`announcement`, `news_news`.`author_id`, `news_news`.`source_name`, 
    `news_news`.`source_url`, `news_news`.`site_id` 
FROM `news_news` 
INNER JOIN `news_news_teams` ON ( `news_news`.`id` = `news_news_teams`.`news_id` ) 
WHERE (
    `news_news`.`status` = 'P' 
    AND `news_news_teams`.`team_id` IN (17, 18, 21, 27, 28, 31, 32, 34, 44, 51, 53, 
        56, 60, 65, 73, 75, 77, 80, 82, 83, 87, 89, 90, 92, 95, 98, 102, 105, 106,
        110, 112, 116, 117, 120, 124, 127, 128, 130, 134, 138, 141, 146, 147, 152,
        154, 156, 158, 161, 165, 166, 169, 170, 174, 176, 179, 181, 184, 185, 188,
        194, 196, 203, 214, 220, 221, 228, 229, 230, 234, 235, 240, 245, 246, 249,
        250, 251, 252, 257, 258, 260, 264, 266, 272, 273, 275, 276, 279, 280, 281,
        283, 284, 285, 287, 294, 296, 297, 318, 319, 320, 326, 327, 330, 332, 334,
        335, 336, 337, 350, 351, 368, 369, 373, 374, 375, 376, 377, 378, 383, 390,
        393, 394, 395, 397, 398, 399, 400, 401, 405, 408, 410, 413, 416, 417, 418,
        421, 425, 426, 427, 431, 434, 439, 440, 441, 443, 448, 451, 452, 453, 454,
        457, 462, 477, 479, 482, 483, 484, 489, 491, 548, 566, 567, 10001792, 10007454))
ORDER BY `news_news`.`date` DESC 
LIMIT 5;

Here is the explain from that query.

+----+-------------+-----------------+--------+---------------------------------------------------------+-------------------------+---------+----------------------------+------+--------------------------------------------------------+
| id | select_type | table           | type   | possible_keys                                           | key                     | key_len | ref                        | rows | Extra                                                  |
+----+-------------+-----------------+--------+---------------------------------------------------------+-------------------------+---------+----------------------------+------+--------------------------------------------------------+
|  1 | SIMPLE      | news_news_teams | range  | news_id,news_news_teams_news_id,news_news_teams_team_id | news_news_teams_team_id | 4       | NULL                       | 4075 | Using index condition; Using temporary; Using filesort |
|  1 | SIMPLE      | news_news       | eq_ref | PRIMARY                                                 | PRIMARY                 | 4       | cs.news_news_teams.news_id |    1 | Using where                                            |
+----+-------------+-----------------+--------+---------------------------------------------------------+-------------------------+---------+----------------------------+------+--------------------------------------------------------+

It took 2+ seconds to be executed, is there a way to optimize it?


Solution

  • Adding index on news_news.date as @Octopi suggested is a good idea (since we order by that field), but in this case it wasn't enough to convince mysql to use the date index. The solution that worked for me was to add one more filter in the WHERE, telling mysql to search only in the news added last month. This increase the query speed around 5-7 times, now it executes for about 250ms.

    SELECT 
        DISTINCT `news_news`.`id`, `news_news`.`title`, `news_news`.`body`, `news_news`.`embed`, 
            `news_news`.`location`, `news_news`.`date`, `news_news`.`priority`, `news_news`.`status`, 
            `news_news`.`slug`, `news_news`.`announcement`, `news_news`.`author_id`, 
            `news_news`.`source_name`, `news_news`.`source_url`, `news_news`.`site_id` 
    FROM `news_news` 
    INNER JOIN `news_news_teams` ON ( `news_news`.`id` = `news_news_teams`.`news_id` ) 
    WHERE (
        `news_news`.`status` = 'P' 
        AND `news_news`.`date` >= DATE_SUB(CURRENT_DATE, INTERVAL 1 MONTH)
        AND `news_news_teams`.`team_id` IN (17, 18, 21, 27, 28, 31, 32, 34, 44, 51, 53, 56, 60, 
            65, 73, 75, 77, 80, 82, 83, 87, 89, 90, 92, 95, 98, 102, 105, 106, 110, 112, 116, 117, 
            120, 124, 127, 128, 130, 134, 138, 141, 146, 147, 152, 154, 156, 158, 161, 165, 166, 
            169, 170, 174, 176, 179, 181, 184, 185, 188, 194, 196, 203, 214, 220, 221, 228, 229, 
            230, 234, 235, 240, 245, 246, 249, 250, 251, 252, 257, 258, 260, 264, 266, 272, 273, 
            275, 276, 279, 280, 281, 283, 284, 285, 287, 294, 296, 297, 318, 319, 320, 326, 327, 
            330, 332, 334, 335, 336, 337, 350, 351, 368, 369, 373, 374, 375, 376, 377, 378, 383, 
            390, 393, 394, 395, 397, 398, 399, 400, 401, 405, 408, 410, 413, 416, 417, 418, 421, 
            425, 426, 427, 431, 434, 439, 440, 441, 443, 448, 451, 452, 453, 454, 457, 462, 477, 
            479, 482, 483, 484, 489, 491, 548, 566, 567, 10001792, 10007454)
    ) ORDER BY `news_news`.`date` DESC 
    LIMIT 5;
    
    //and the explain
    +----+-------------+-----------------+-------+---------------------------------------------------------+----------------+---------+-----------------+------+-----------------------------------------------------+
    | id | select_type | table           | type  | possible_keys                                           | key            | key_len | ref             | rows | Extra                                               |
    +----+-------------+-----------------+-------+---------------------------------------------------------+----------------+---------+-----------------+------+-----------------------------------------------------+
    |  1 | SIMPLE      | news_news       | range | PRIMARY,news_news_date                                  | news_news_date | 3       | NULL            |  190 | Using index condition; Using where; Using temporary |
    |  1 | SIMPLE      | news_news_teams | ref   | news_id,news_news_teams_news_id,news_news_teams_team_id | news_id        | 4       | cs.news_news.id |    1 | Using where; Using index; Distinct                  |
    +----+-------------+-----------------+-------+---------------------------------------------------------+----------------+---------+-----------------+------+-----------------------------------------------------+
    

    PS. DISTINCT was added in order to exclude duplicate news.