mysql group-by query-optimization having

Does MySQL eliminate common subexpressions between the SELECT and HAVING/GROUP BY clauses?


I often see people answer MySQL questions with queries like this:

SELECT DAY(date), other columns
FROM table
GROUP BY DAY(date);

SELECT somecolumn, COUNT(*)
FROM table
GROUP BY somecolumn
HAVING COUNT(*) > 1;

I always like to give the column an alias and refer to that in the GROUP BY or HAVING clause, e.g.

SELECT DAY(date) AS day, other columns
FROM table
GROUP BY day;

SELECT somecolumn, COUNT(*) AS c
FROM table
GROUP BY somecolumn
HAVING c > 1;

Is MySQL smart enough to notice that the expressions in the later clauses are the same as in SELECT, and evaluate them only once? I'm not sure how to test this -- EXPLAIN doesn't show any difference, but it doesn't seem to show how the grouping or filtering is done in the first place; it seems mainly useful for optimizing joins and WHERE clauses.

I tend to be pessimistic about MySQL optimization, so I like to give it all the help I can.


Solution

  • I think this can be tested using the SLEEP() function.
    For example, take a look at this demo: http://sqlfiddle.com/#!2/0bc1b/1

    SELECT * FROM t;
    
    | X |
    |---|
    | 1 |
    | 2 |
    | 2 |
    
    SELECT x+sleep(1)
    FROM t
    GROUP BY x+sleep(1);
    
    SELECT x+sleep(1) AS name
    FROM t
    GROUP BY name;
    

    Both queries take about 3000 ms (3 seconds) to execute.
    The table has 3 records, and each query sleeps for only 1 second per record,
    which means the expression is evaluated once for each record, not twice.
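
    The same timing approach could be extended to the HAVING case from the question,
    which the demo above does not cover. This is only a sketch: it assumes a table t
    with a single INT column x holding the three rows shown above, and the alias c is
    an illustrative name. No timings are claimed here. Since SLEEP() has side effects,
    the comparison shows how often this particular expression runs and may not
    generalize to ordinary expressions.

    ```sql
    -- Recreate the demo table (assumed schema: one INT column x).
    CREATE TABLE t (x INT);
    INSERT INTO t (x) VALUES (1), (2), (2);

    -- Variant 1: the expression is repeated in SELECT and HAVING.
    SELECT x, COUNT(*) + SLEEP(1)
    FROM t
    GROUP BY x
    HAVING COUNT(*) + SLEEP(1) > 1;

    -- Variant 2: the expression is aliased in SELECT and the alias is used in HAVING.
    SELECT x, COUNT(*) + SLEEP(1) AS c
    FROM t
    GROUP BY x
    HAVING c > 1;
    ```

    If the first variant takes noticeably longer than the second, that would suggest
    the HAVING expression is evaluated separately from the one in SELECT.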