Search code examples
mysqldatabaseperformancegroup-bycovering-index

Why does the following query copy table data?


SELECT COUNT(*) AS cnt
FROM products
WHERE ExternalProductId IS NOT NULL
GROUP BY SourceId, ExternalProductId
HAVING cnt > 1

There is an index on (ExternalProductId, SourceId, AnotherField). An Explain shows the index is used. This is printed in the "Extra" column of explain:

Using where; Using index; Using temporary; Using filesort

When I run the query I see via SHOW PROCESSLIST:

Copying to tmp table on disk

Can I tweak this query to work in place on the index? I also don't mind if the results I get are slightly inaccurate because of other processes simultaneously working on this table - can I change the isolation level to increase the query's performance?


Solution

  • If you reverse the columns in your GROUP BY to correspond with the ordering of the first two fields of your compound index, it'll use your compound index much more effectively.

    SELECT COUNT(*) AS cnt
    FROM products
    WHERE ExternalProductId IS NOT NULL
    GROUP BY ExternalProductId, SourceId
    HAVING cnt > 1
    

    Your query execution plain should turn into 'Using where; Using index', and get rid of both the temporary table and filesort caused by the other GROUP BY.

    You'll still get the same results, but it'll be in a slightly different order.