I have a table called stories that currently holds 12 million records in production.
CREATE TABLE `stories` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`headline` varchar(255) DEFAULT NULL,
`author_id` int(11) DEFAULT NULL,
`body` longtext NOT NULL,
`published_at` datetime DEFAULT NULL,
`type_id` int(11) NOT NULL DEFAULT '0',
`created_at` datetime DEFAULT NULL,
`updated_at` datetime DEFAULT NULL,
`aasm_state` varchar(255) NOT NULL,
`deleted` tinyint(1) DEFAULT '0',
`word_count` int(11) NOT NULL DEFAULT '0',
PRIMARY KEY (`id`),
UNIQUE KEY `index_stories_on_cms_story_id` (`cms_story_id`),
KEY `typeid` (`type_id`),
KEY `index_stories_on_published_at` (`published_at`),
KEY `index_stories_on_updated_at` (`updated_at`),
KEY `index_stories_on_aasm_state_and_published_at_and_deleted` (`aasm_state`,`published_at`,`deleted`),
KEY `idx_author_id` (`author_id`)
) ENGINE=InnoDB AUTO_INCREMENT=511625276 DEFAULT CHARSET=utf8;
I am running the following query against it. Fetching only the id runs fine:
SELECT `stories`.id
FROM `stories`
WHERE `stories`.`aasm_state` = 'scheduled'
AND `stories`.`deleted` = 0
AND (`stories`.`published_at` <= '2020-01-14 06:16:04')
AND (`stories`.`id` > 519492608)
ORDER BY `stories`.`id` ASC
LIMIT 1000;
...
1000 rows in set (0.59 sec)
However, when I add the longtext column (body) to the select list, the same query takes over six minutes:
SELECT `stories`.id
, `stories`.body
FROM `stories`
WHERE `stories`.`aasm_state` = 'scheduled'
AND `stories`.`deleted` = 0
AND (`stories`.`published_at` <= '2020-01-14 06:16:04')
AND (`stories`.`id` > 519492608)
ORDER BY `stories`.`id` ASC LIMIT 1000;
...
1000 rows in set (6 min 34.11 sec)
Any performance tips on how to deal with this table?
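Typically a relational DBMS will apply ORDER BY and LIMIT after retrieving the initial result set, so here it probably needs to load the body of every matching story, sort the lot, and only then throw away everything past the first 1000 rows. The id-only query avoids that work: InnoDB stores the primary key in every secondary index, so it can likely be answered from the index on (aasm_state, published_at, deleted) without touching the rows at all. I don't have access to your record set, but at a guess, applying the sorting and the limit before retrieving the bulk content may improve performance: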
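SELECT bulk.id, bulk.body
FROM (
-- pick the 1000 ids first; this part does not need the body column
SELECT `stories`.id
FROM `stories`
WHERE `stories`.`aasm_state` = 'scheduled'
AND `stories`.`deleted` = 0
AND (`stories`.`published_at` <= '2020-01-14 06:16:04')
AND (`stories`.`id` > 519492608)
ORDER BY `stories`.`id` ASC
LIMIT 1000
) ids
INNER JOIN stories bulk
ON ids.id = bulk.id
-- the join does not guarantee the inner ordering, so sort again
ORDER BY bulk.id ASC;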
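To check whether that guess holds on your data, you could compare the execution plans of your two original queries. This is only a sketch: I can't predict the exact output for your server, but if the Extra column shows "Using index" for the first statement and not for the second, the first is being served from the index alone and the second is reading full rows.

```sql
-- Plan for the fast query (id only).
EXPLAIN
SELECT `stories`.id
FROM `stories`
WHERE `stories`.`aasm_state` = 'scheduled'
AND `stories`.`deleted` = 0
AND `stories`.`published_at` <= '2020-01-14 06:16:04'
AND `stories`.`id` > 519492608
ORDER BY `stories`.`id` ASC
LIMIT 1000;

-- Plan for the slow query (id plus the longtext body).
EXPLAIN
SELECT `stories`.id, `stories`.body
FROM `stories`
WHERE `stories`.`aasm_state` = 'scheduled'
AND `stories`.`deleted` = 0
AND `stories`.`published_at` <= '2020-01-14 06:16:04'
AND `stories`.`id` > 519492608
ORDER BY `stories`.`id` ASC
LIMIT 1000;
```

Compare the key, rows and Extra columns between the two, then run EXPLAIN on the subquery version as well to see whether the plan actually changes.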
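(BTW, you might consider researching indexes more - what you have put here looks rather suspect. For example, the unique key is on cms_story_id, which does not appear in the column list you posted, and the composite index puts the range-filtered published_at ahead of deleted.)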
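On the index point, one option worth testing on a copy of the data is an index shaped around this particular query. Treat this as a sketch under assumptions, not a recommendation: the index name is made up, I am assuming a MySQL version with online DDL for InnoDB (5.6 or later), and it only pays off if most scheduled, non-deleted rows also pass the published_at filter.

```sql
-- Hypothetical index: the two equality columns first, then id so that rows
-- come back already in id order starting from id > N, then published_at so
-- the date filter can be checked from the index entry itself.
ALTER TABLE `stories`
ADD INDEX `idx_stories_state_deleted_id_published`
(`aasm_state`, `deleted`, `id`, `published_at`),
ALGORITHM=INPLACE, LOCK=NONE;

-- Check whether the optimizer picks it up for the id-only part of the query.
EXPLAIN
SELECT `stories`.id
FROM `stories`
WHERE `stories`.`aasm_state` = 'scheduled'
AND `stories`.`deleted` = 0
AND `stories`.`published_at` <= '2020-01-14 06:16:04'
AND `stories`.`id` > 519492608
ORDER BY `stories`.`id` ASC
LIMIT 1000;
```

If the plan no longer shows "Using filesort", the server can stop reading after the first 1000 matching index entries instead of sorting the whole result set. Building an index on a 12 million row table takes time and I/O even with LOCK=NONE, so try it away from production first.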