Search code examples
mysqlsqlgroup-bycounthaving-clause

Why is this code able to use an alias from SELECT in HAVING?


I found this solution online to a HackerRank problem:

SELECT c.hacker_id, h.name, COUNT(c.challenge_id) AS cnt 
FROM Hackers AS h 
JOIN Challenges AS c ON h.hacker_id = c.hacker_id
GROUP BY c.hacker_id, h.name 
HAVING
    cnt = (SELECT COUNT(c1.challenge_id) FROM Challenges AS c1 GROUP BY c1.hacker_id ORDER BY COUNT(*) DESC LIMIT 1) OR
    cnt NOT IN (SELECT COUNT(c2.challenge_id) FROM Challenges AS c2 GROUP BY c2.hacker_id HAVING c2.hacker_id <> c.hacker_id)
ORDER BY cnt DESC, c.hacker_id;

As you can see, cnt is defined in SELECT, and then used in HAVING. My understanding was that, logically speaking, GROUP BY and HAVING run before SELECT. Therefore I would expect that SQL would not recognize cnt when executing GROUP BY, and throw an error. However, the code gives the correct solution.

Can anyone explain why? Maybe MySQL corrects for this artificially?


Solution

  • Your assumption that aliases defined in the SELECT clause cannot be used in the HAVING clause is generally true: this is not allowed in standard SQL (also called ANSI SQL), and most databases do not allow that.

    However, MySQL extends the standard, as explained in the documentation:

    The MySQL extension permits the use of an alias in the HAVING clause for the aggregated column.

    So the code you are showing is not valid ANSI SQL, but it is a valid MySQL query. There are many other extensions to ANSI SQL that MySQL implements, and that are described in the linked documentation.

    More generally speaking, there is no database that I know about that fully and strictly complies with the ANSI spec. They all have missing features and extensions. Learning SQL, you also need to learn the specificities of the database(s) you are running.