MySQL: adding a longtext column to a query makes it extremely slow - any performance tips?


I have a table called stories that currently holds 12 million records in production.

CREATE TABLE `stories` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `headline` varchar(255) DEFAULT NULL,
  `author_id` int(11) DEFAULT NULL,
  `body` longtext NOT NULL,
  `published_at` datetime DEFAULT NULL,
  `type_id` int(11) NOT NULL DEFAULT '0',
  `created_at` datetime DEFAULT NULL,
  `updated_at` datetime DEFAULT NULL,
  `aasm_state` varchar(255) NOT NULL,
  `deleted` tinyint(1) DEFAULT '0',
  `word_count` int(11) NOT NULL DEFAULT '0',
  PRIMARY KEY (`id`),
  UNIQUE KEY `index_stories_on_cms_story_id` (`cms_story_id`),
  KEY `typeid` (`type_id`),
  KEY `index_stories_on_published_at` (`published_at`),
  KEY `index_stories_on_updated_at` (`updated_at`),
  KEY `index_stories_on_aasm_state_and_published_at_and_deleted` (`aasm_state`,`published_at`,`deleted`),
  KEY `idx_author_id` (`author_id`)
) ENGINE=InnoDB AUTO_INCREMENT=511625276 DEFAULT CHARSET=utf8;

I am running the following queries. Fetching just the id runs fine:

SELECT  `stories`.id 
  FROM `stories` 
 WHERE `stories`.`aasm_state` = 'scheduled'  
   AND `stories`.`deleted` = 0 
   AND (`stories`.`published_at` <= '2020-01-14 06:16:04') 
   AND (`stories`.`id` > 519492608)  
 ORDER 
    BY `stories`.`id` ASC 
  LIMIT 1000;
...
1000 rows in set (0.59 sec)

However, when I add the longtext column to it, I get:

mysql> SELECT  `stories`.id, `stories`.body 
FROM `stories` 
WHERE `stories`.`aasm_state` = 'scheduled' 
AND `stories`.`deleted` = 0 
AND (`stories`.`published_at` <= '2020-01-14 06:16:04') 
AND (`stories`.`id` > 519492608)  
ORDER BY `stories`.`id` ASC LIMIT 1000;
...
1000 rows in set (6 min 34.11 sec)

Any performance tips on how to deal with this table?


Solution

  • Typically a relational DBMS applies ORDER BY after retrieving the initial result set, so it has to load every matching story, body included, and sort them before LIMIT trims the result to 1000 rows. I don't have access to your record set, but at a guess, sorting and limiting on id first, then fetching the bulk content for just those rows, may improve performance:

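    -- Pick the 1000 ids first, then fetch body only for those rows.
    SELECT bulk.id, bulk.body
    FROM (
       SELECT `stories`.id 
       FROM `stories` 
       WHERE `stories`.`aasm_state` = 'scheduled'  
       AND `stories`.`deleted` = 0 
       AND (`stories`.`published_at` <= '2020-01-14 06:16:04') 
       AND (`stories`.`id` > 519492608)  
       ORDER BY `stories`.`id` ASC 
       LIMIT 1000
    ) ids 
    INNER JOIN stories bulk
    ON ids.id = bulk.id
    -- The join does not guarantee row order, so sort the final result again.
    ORDER BY bulk.id ASC;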
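
    One way to check this guess is to compare the execution plans with EXPLAIN. This is a minimal sketch that reuses the queries from the question. The expectation that the id-only query shows "Using index" in the Extra column is an assumption about how the optimizer treats this table, not something measured here:

    ```sql
    -- Plan for the id-only query. "Using index" in Extra would mean it is
    -- answered from a secondary index alone, without touching the full rows
    -- (InnoDB secondary indexes also store the primary key).
    EXPLAIN
    SELECT `stories`.id
    FROM `stories`
    WHERE `stories`.`aasm_state` = 'scheduled'
      AND `stories`.`deleted` = 0
      AND `stories`.`published_at` <= '2020-01-14 06:16:04'
      AND `stories`.`id` > 519492608
    ORDER BY `stories`.`id` ASC
    LIMIT 1000;

    -- Plan for the derived-table version, which reads body for at most 1000 rows.
    EXPLAIN
    SELECT bulk.id, bulk.body
    FROM (
        SELECT `stories`.id
        FROM `stories`
        WHERE `stories`.`aasm_state` = 'scheduled'
          AND `stories`.`deleted` = 0
          AND `stories`.`published_at` <= '2020-01-14 06:16:04'
          AND `stories`.`id` > 519492608
        ORDER BY `stories`.`id` ASC
        LIMIT 1000
    ) ids
    INNER JOIN stories bulk ON ids.id = bulk.id
    ORDER BY bulk.id ASC;
    ```

    If the plan for the original slow query lacks "Using index" while the id-only plan has it, that would be consistent with the body lookups being the expensive part.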
    

    (BTW, you might consider reviewing your indexes - what you have here looks rather suspect.)
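
    As one illustration of what an index review could turn up: the existing three-column index places the range column published_at before deleted, so MySQL can seek only on aasm_state and the published_at range, and has to check deleted entry by entry. Below is a sketch of an alternative that puts both equality columns first. It assumes the query in the question is representative of the workload. The index name idx_stories_state_deleted_published is hypothetical, and building an index on a 12-million-row production table should be tried on a copy first:

    ```sql
    -- Hypothetical index: equality columns (aasm_state, deleted) first,
    -- range column (published_at) last, so all three can narrow the scan.
    ALTER TABLE `stories`
      ADD INDEX `idx_stories_state_deleted_published`
        (`aasm_state`, `deleted`, `published_at`);
    ```

    Whether the optimizer actually prefers this index, and whether it helps, is something only EXPLAIN and timing on the real data can confirm.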