I have three tables that are concerned by this query
CREATE TABLE `tags` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`latName` varchar(191) NOT NULL,
`araName` varchar(191) NOT NULL,
`active` tinyint(1) NOT NULL DEFAULT 0,
`img_name` varchar(191) DEFAULT NULL,
`icon` varchar(191) DEFAULT NULL,
`rgba_color` varchar(191) DEFAULT NULL,
`color` varchar(191) DEFAULT NULL,
`overlay` varchar(191) DEFAULT NULL,
`position` int(11) NOT NULL,
`mdi_icon` varchar(191) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `tags_latname_unique` (`latName`),
UNIQUE KEY `tags_araname_unique` (`araName`)
) ENGINE=InnoDB AUTO_INCREMENT=9 DEFAULT CHARSET=utf8mb3
CREATE TABLE `newspapers` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`latName` varchar(191) NOT NULL,
`araName` varchar(191) NOT NULL,
`img_name` varchar(191) DEFAULT NULL,
`active` tinyint(1) NOT NULL DEFAULT 0,
PRIMARY KEY (`id`),
UNIQUE KEY `newspapers_latname_unique` (`latName`),
UNIQUE KEY `newspapers_araname_unique` (`araName`)
) ENGINE=InnoDB AUTO_INCREMENT=21 DEFAULT CHARSET=utf8mb3
CREATE TABLE `articles` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`newspaper_id` bigint(20) unsigned NOT NULL,
`tag_id` bigint(20) unsigned NOT NULL,
`seen` int(10) unsigned NOT NULL,
`link` varchar(1000) NOT NULL,
`title` varchar(191) NOT NULL,
`img_name` varchar(191) NOT NULL,
`date` datetime NOT NULL,
`paragraph` text NOT NULL,
`read_time` int(11) DEFAULT NULL,
`created_at` timestamp NULL DEFAULT NULL,
`updated_at` timestamp NULL DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `articles_link_unique` (`link`),
UNIQUE KEY `articles_img_name_unique` (`img_name`),
KEY `articles_newspaper_id_foreign` (`newspaper_id`),
KEY `articles_tag_id_foreign` (`tag_id`),
CONSTRAINT `articles_newspaper_id_foreign` FOREIGN KEY (`newspaper_id`) REFERENCES `newspapers` (`id`),
CONSTRAINT `articles_tag_id_foreign` FOREIGN KEY (`tag_id`) REFERENCES `tags` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=47421 DEFAULT CHARSET=utf8mb3
Basically, I want to load the latest 5 articles (ordered by date) that have an active newspaper and active tag.
Right now articles table contains about 40k entries.
This is the query generated by Laravel's query builder
SELECT `articles`.*
FROM `articles`
INNER JOIN `tags` ON `tags`.`id` = `articles`.`tag_id`
AND `tags`.`active` = 1
INNER JOIN `newspapers` ON `newspapers`.`id` = `articles`.`newspaper_id`
AND `newspapers`.`active` = 1
ORDER BY `date` DESC
LIMIT 5;
It takes Mysql about 6sec to run the query, when I remove the ORDER BY
clause, the query becomes very fast (0.001sec).
Here is the query explanation:
+------+-------------+------------+--------+-------------------------------------------------------+-------------------------------+---------+------------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+------------+--------+-------------------------------------------------------+-------------------------------+---------+------------------------+------+----------------------------------------------+
| 1 | SIMPLE | newspapers | ALL | PRIMARY | NULL | NULL | NULL | 18 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | articles | ref | articles_newspaper_id_foreign,articles_tag_id_foreign | articles_newspaper_id_foreign | 8 | mouhim.newspapers.id | 1127 | |
| 1 | SIMPLE | tags | eq_ref | PRIMARY | PRIMARY | 8 | mouhim.articles.tag_id | 1 | Using where |
+------+-------------+------------+--------+-------------------------------------------------------+-------------------------------+---------+------------------------+------+----------------------------------------------+
I tried creating an index on the date attribute but it didn't help.
for convenience, this is how I am using Query Builder for this query:
Article::select("articles.*")
->join("tags", function ($join) {
$join->on("tags.id", "articles.tag_id")
->where("tags.active", 1);
})
->join("newspapers", function ($join) {
$join->on("newspapers.id", "articles.newspaper_id")
->where("newspapers.active", 1);
})
->orderBy("date", "desc")
->paginate(5)
At first, I was using Eloquent (whereHas) but Eloquent was generating non optimized query using (where exists), so I had to go the joins way.
What can I do to improve execution time of this query?
Result of SHOW INDEXES FROM articles;
+----------+------------+-------------------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Ignored |
+----------+------------+-------------------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+
| articles | 0 | PRIMARY | 1 | id | A | 36072 | NULL | NULL | | BTREE | | | NO |
| articles | 0 | articles_link_unique | 1 | link | A | 36072 | NULL | NULL | | BTREE | | | NO |
| articles | 0 | articles_img_name_unique | 1 | img_name | A | 36072 | NULL | NULL | | BTREE | | | NO |
| articles | 1 | articles_newspaper_id_foreign | 1 | newspaper_id | A | 32 | NULL | NULL | | BTREE | | | NO |
| articles | 1 | articles_tag_id_foreign | 1 | tag_id | A | 12 | NULL | NULL | | BTREE | | | NO |
| articles | 1 | data | 1 | date | A | 36072 | NULL | NULL | | BTREE | | | NO |
+----------+------------+-------------------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+
This query was suggested by Rick James as a solution
SELECT `articles`.*
FROM `articles`
WHERE EXISTS ( SELECT 1 FROM tags WHERE id = `articles`.`tag_id` and active = 1)
AND EXISTS ( SELECT 1 FROM newspapers WHERE id = `articles`.`newspaper_id` and active = 1)
ORDER BY `date` DESC
LIMIT 5;
Running EXPLAIN on this query yields the following result
+------+-------------+------------+--------+-------------------------------------------------------+-------------------------------+---------+------------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+------------+--------+-------------------------------------------------------+-------------------------------+---------+------------------------+------+----------------------------------------------+
| 1 | PRIMARY | newspapers | ALL | PRIMARY | NULL | NULL | NULL | 18 | Using where; Using temporary; Using filesort |
| 1 | PRIMARY | articles | ref | articles_newspaper_id_foreign,articles_tag_id_foreign | articles_newspaper_id_foreign | 8 | mouhim.newspapers.id | 1127 | |
| 1 | PRIMARY | tags | eq_ref | PRIMARY | PRIMARY | 8 | mouhim.articles.tag_id | 1 | Using where |
+------+-------------+------------+--------+-------------------------------------------------------+-------------------------------+---------+------------------------+------+----------------------------------------------+
Assuming you don't want dups, change to this; it is likely to be much faster:
SELECT `articles`.*
FROM `articles`
WHERE EXISTS ( SELECT 1 FROM tags
WHERE id = `articles`.`tag_id` )
AND EXISTS ( SELECT 1 FROM newspapers
WHERE id = `articles`.`newspaper_id` )
ORDER BY `date` DESC
LIMIT 5;
Also, have this index on articles
:
INDEX(date)
(This is a rare use case for starting index with a column that will be used in a 'range'.)
(Sorry, I don't speak 'Laravel'; maybe someone else can help with that part.)
PS. Having 3 UNIQUE keys on a table is highly unusual. It often indicates a problem with the schema design.
each article has one and only one Tag associated with it
Can multiple articles have the same Tag?
when I remove the ORDER BY clause, the query becomes very fast (0.001sec).
That is because you get whatever 5 rows are easy to return to you. Clearly the ORDER BY
is part of the requirement. "Using temporary; Using filesort" says there was at least a sort. It will actually be a "file" sort -- because SELECT *
includes a TEXT
column. (There is a technique to avoid "file", but I don't think it is needed here.)