I found this solution online to a HackerRank problem:
SELECT c.hacker_id, h.name, COUNT(c.challenge_id) AS cnt
FROM Hackers AS h
JOIN Challenges AS c ON h.hacker_id = c.hacker_id
GROUP BY c.hacker_id, h.name
HAVING
cnt = (SELECT COUNT(c1.challenge_id) FROM Challenges AS c1 GROUP BY c1.hacker_id ORDER BY COUNT(*) DESC LIMIT 1) OR
cnt NOT IN (SELECT COUNT(c2.challenge_id) FROM Challenges AS c2 GROUP BY c2.hacker_id HAVING c2.hacker_id <> c.hacker_id)
ORDER BY cnt DESC, c.hacker_id;
As you can see, cnt
is defined in SELECT, and then used in HAVING. My understanding was that, logically speaking, GROUP BY and HAVING run before SELECT. Therefore I would expect that SQL would not recognize cnt
when executing GROUP BY, and throw an error. However, the code gives the correct solution.
Can anyone explain why? Maybe MySQL corrects for this artificially?
Your assumption that aliases defined in the SELECT
clause cannot be used in the HAVING
clause is generally true: this is not allowed in standard SQL (also called ANSI SQL), and most databases do not allow that.
However, MySQL extends the standard, as explained in the documentation:
The MySQL extension permits the use of an alias in the
HAVING
clause for the aggregated column.
So the code you are showing is not valid ANSI SQL, but it is a valid MySQL query. There are many other extensions to ANSI SQL that MySQL implements, and that are described in the linked documentation.
More generally speaking, there is no database that I know about that fully and strictly complies with the ANSI spec. They all have missing features and extensions. Learning SQL, you also need to learn the specificities of the database(s) you are running.