Search code examples
mysqlsqlstandards-complianceansi-sql

Why does MySQL allow "group by" queries WITHOUT aggregate functions?


Surprise -- this is a perfectly valid query in MySQL:

select X, Y from someTable group by X

If you tried this query in Oracle or SQL Server, you’d get the natural error message:

Column 'Y' is invalid in the select list because it is not contained in 
either an aggregate function or the GROUP BY clause.

So how does MySQL determine which Y to show for each X? It just picks one. From what I can tell, it just picks the first Y it finds. The rationale being, if Y is neither an aggregate function nor in the group by clause, then specifying “select Y” in your query makes no sense to begin with. Therefore, I as the database engine will return whatever I want, and you’ll like it.

There’s even a MySQL configuration parameter to turn off this “looseness”. http://dev.mysql.com/doc/refman/5.7/en/sql-mode.html#sqlmode_only_full_group_by

This article even mentions how MySQL has been criticized for being ANSI-SQL non-compliant in this regard. http://www.oreillynet.com/databases/blog/2007/05/debunking_group_by_myths.html

My question is: Why was MySQL designed this way? What was their rationale for breaking with ANSI-SQL?


Solution

  • I believe that it was to handle the case where grouping by one field would imply other fields are also being grouped:

    SELECT user.id, user.name, COUNT(post.*) AS posts 
    FROM user 
      LEFT OUTER JOIN post ON post.owner_id=user.id 
    GROUP BY user.id
    

    In this case the user.name will always be unique per user.id, so there is convenience in not requiring the user.name in the GROUP BY clause (although, as you say, there is definite scope for problems)