Search code examples
sqlmysqldreamhost

How can I improve the performance of this query?


Recently I had this question, and everything worked properly until I sent it to my server at DreamHost.

The query bellow seems to take too long to execute and I can't figure out why so many rows are processed at once. In my local server the same query was executed in 0.3 seconds.

SELECT feed_entries . *
FROM feed_entries
WHERE 
id
IN (

SELECT e.id
FROM feed_entries AS e
INNER JOIN feeds AS f ON e.feed_id = f.id
INNER JOIN entries_categorias AS ec ON ec.entry_id = e.id
INNER JOIN categorias AS c ON ec.categoria_id = c.id
WHERE 
e.deleted =0
AND 
c.slug
IN ('entenda', 'google')
GROUP BY e.id
HAVING COUNT( DISTINCT ec.id ) =2

)
ORDER BY date DESC
LIMIT 1

This is the message I've received from the support team:

Well, it looks like that the query is locked up due to these other queries like this one, processing 6.5 million records, for 11 seconds.

# Query_time: 11.639269  Lock_time: 0.000192 Rows_sent: 2  Rows_examined:
6509098
use desenvolvimentistas;
SET timestamp=1280325753;
SELECT `e`.*, `f`.`titulo` AS `feedTitulo`, `f`.`url` AS `feedUrl` FROM
`feed_entries` AS `e`
 INNER JOIN `feeds` AS `f` ON e.feed_id =f.id WHERE (e.id IN (SELECT
`e`.`id` FROM `feed_entries` AS `e`
 INNER JOIN `feeds` AS `f` ON e.feed_id =f.id
 INNER JOIN `entries_categorias` AS `ec` ON ec.entry_id =e.id
 INNER JOIN `categorias` AS `c` ON ec.categoria_id =c.id WHERE (e.deleted
=0) AND (e.id NOT IN ('', '', '842', '853', '774', '878')) AND
(e.imagem145x145 =1) AND (c.slug IN('destaque-2')) GROUP BY `e`.`id`
HAVING (COUNT(DISTINCT ec.id) =1))) ORDER BY `e`.`date` DESC LIMIT 4;

Likely the query takes .3 seconds to run on your local machine that is idle, but it is slow on our servers because it is running the query 150 at a time, and each one is processing 15.3 million records.

I had our mysql admin took a look at it and he said it was quite inefficient and was surprised that the database was even accessible because the queries had it so locked up. He said the database will be disabled if it continues this way and causes server problems. He said you should fix the rows examined or get a MySQL PS, although a mysql ps will not fix it, but instead will only prevent the database from being disabled.

Here is the query:

# Query_time: 25.944779  Lock_time: 0.000176 Rows_sent: 0  Rows_examined:
15378209
use desenvolvimentistas;
SELECT `feed_entries`.* FROM `feed_entries` WHERE (id IN (SELECT `e`.`id`
FROM `feed_entries` AS `e` INNER JOIN `feeds` AS `f` ON e.feed_id =f.id
INNER JOIN `entries_categorias` AS `ec` ON ec.entry_id =e.id INNER JOIN
`categorias` AS `c` ON ec.categoria_id =c.id WHERE (e.deleted =0) AND
(c.slug IN('entenda','google')) GROUP BY `e`.`id` HAVING (COUNT(DISTINCT
ec.id) =2))) ORDER BY `date` DESC LIMIT 1;

Here is another that needs to be fixed:

# Query_time: 27.010857  Lock_time: 0.000165 Rows_sent: 0  Rows_examined:
15382750
use desenvolvimentistas;
SET timestamp=1280325706;
SELECT `e`.*, `f`.`titulo` AS `feedTitulo`, `f`.`url` AS `feedUrl` FROM
`feed_entries` AS `e`
 INNER JOIN `feeds` AS `f` ON e.feed_id =f.id WHERE (e.id IN (SELECT
`e`.`id` FROM `feed_entries` AS `e`
 INNER JOIN `feeds` AS `f` ON e.feed_id =f.id
 INNER JOIN `entries_categorias` AS `ec` ON ec.entry_id =e.id
 INNER JOIN `categorias` AS `c` ON ec.categoria_id =c.id WHERE (e.deleted
=0) AND (c.slug IN('manchete', 'google')) GROUP BY `e`.`id` HAVING
(COUNT(DISTINCT ec.id) =2))) ORDER BY `e`.`date` DESC LIMIT 4;7:18

Solution

  • Ideally, one query per question please - there's no limit to the number of questions you can ask, and it minimizes chatter associated with each individual query.

    I re-wrote your first query as:

      SELECT t.*
        FROM feed_entries t
       WHERE EXISTS(SELECT NULL
                      FROM feed_entries AS e
                      JOIN feeds AS f ON e.feed_id = f.id
                      JOIN entries_categorias AS ec ON ec.entry_id = e.id
                      JOIN categorias AS c ON ec.categoria_id = c.id
                     WHERE e.deleted = 0
                       AND c.slug IN ('entenda', 'google')
                       AND e.id = t.id
                  GROUP BY e.id
                    HAVING COUNT( DISTINCT ec.id ) = 2)
    ORDER BY date DESC
       LIMIT 1
    

    Index the columns used in the JOIN criteria at a minimum, if not already.

    I saw a comment about a query being run 150 times - could you elaborate?