Search code examples
mysqlsqlpostgresqlgroup-by

GROUP BY only primary key, but select other values


Is there a way to group by a unique (primary) key, essentially giving an implicit guarantee that the other columns from that table will be well-defined?

SELECT myPrimaryKey, otherThing
FROM myTable
GROUP BY myPrimaryKey

I know that I can add the other columns to the statement (GROUP BY myPrimaryKey,otherThing), but I'm trying to avoid that. If you're curious why, read on:


I have a statement which is essentially doing this:

SELECT nodes.node_id, nodes.node_label, COUNT(1)
FROM {a couple of joined tables}
INNER JOIN nodes USING (node_id)
GROUP BY nodes.node_id, nodes.node_label

which works fine, but is a bit slow in MySQL. If I remove nodes.node_label from the GROUP BY, it runs about 10x faster (according to EXPLAIN, this is because one of the earlier joins starts using an index when previously it didn't).

We're in the process of migrating to Postgres, so all new statements are supposed to be compatible with both MySQL and Postgres when possible. Now in Postgres, the original statement runs fast, but the new statement (with the reduced group by) won't run (because Postgres is stricter). In this case, it's a false error because the statement is actually well-defined.

Is there a syntax I can use which will let the same statement run in both platforms, while letting MySQL use just one column in the group by for speed?


Solution

  • You could try converting the other columns into aggregates:

    SELECT myPrimaryKey, MAX(otherThing)
    FROM myTable
    GROUP BY myPrimaryKey