Search code examples
postgresqljoingroup-byuniongreatest-n-per-group

PostgreSQL distinct and group on different fields


With the following query I can get the list of project members added in the memberships table, union'ed with the the projects owners (who may not have an entry in the memberships table)

select sub.user, sub.project, sub.role, sub.order, sub.name from
(SELECT 
    memberships."user",
    memberships.project,
    memberships.role,
    roles."order",
    roles.name
FROM memberships
    JOIN roles ON roles.id = memberships.role
UNION
    SELECT projects.owner AS "user",
        projects.id AS project,
        1 AS role,
        0 AS "order",
        'admin'::text AS name
    FROM projects
) as sub

The above query yields the following result set.

8   2   1   0   "admin"
8   1   3   2   "contributor"       (added through memberships table)
6   1   1   0   "admin"
8   4   1   0   "admin"
8   1   1   0   "admin"             (duplicate because user #8 is the owner of project #1)

Now I want to remove the duplicate entries by taking the contents of the row that has least order. using distinct on (sub.order) does not include all rows

select distinct on (sub.order) * from
-- the same subquery
order by sub.order

The above yields

8   2   1   0   "admin"
8   1   3   2   "contributor"

Using group by sub.user, sub.project and aggregating min(sub.order) works but the other two fields like role and name is left out

select sub.user, sub.project, min(sub.order) from
-- the same subquery
group by sub.user, sub.project

I want the role, name and order of the row that has the minimum order when grouped with user, project pair


Solution

  • I want the role, name and order of the row that has the minimum order when grouped with user, project pair

    The distinct on must enumerate the "grouping" columns - then the order by clause must contain the same columns, followed by the column(s) to use to break the ties.

    You probably want:

    select distinct on (t.user, t.project) * 
    from ( 
        -- the same subquery -- 
    ) t
    order by t.user, t.project, t.order