Search code examples
sqlpostgresqlgreatest-n-per-grouppostgresql-9.1rails-postgresql

Get "latest" row after GROUP BY over multiple tables


I'd preferably like to first query listed below and just group by stories.id, but I get the following error:

ERROR: column "u.first_name" must appear in the GROUP BY clause or be used in an aggregate function LINE 1: SELECT "s".*, "u"."first_name", "u"."last_name", ("i"."filen...

The second query works but does not group by stories.id and generates the wrong results. Is it possible to select from multiple tables and not group by all of them?

The table panels also has a column updated_at. I would like to get the newest file per story according to panels.updated_at.

   SELECT 
        "s".*, 
        "u"."first_name", 
        "u"."last_name", 
        ("i"."filename" || '.' || "i"."extension") AS "file" 
    FROM 
        "stories" "s" 
    LEFT JOIN "panels" "p" ON("p"."story_id" = "s"."id") 
    LEFT JOIN "users" "u" ON("s"."user_id" = "u"."uid")
    LEFT JOIN "images" "i" ON ("p"."image_id" = "i"."id") 
    WHERE 
        "s"."complete" = false AND 
        "s"."created_by" = 205700489 
    GROUP BY 
        "s"."id", 
    ORDER BY 
        "s"."created_at" DESC

   SELECT 
        "s".*, 
        "u"."first_name", 
        "u"."last_name", 
        ("i"."filename" || '.' || "i"."extension") AS "file" 
    FROM 
        "stories" "s" 
    LEFT JOIN "panels" "p" ON("p"."story_id" = "s"."id") 
    LEFT JOIN "users" "u" ON("s"."user_id" = "u"."uid")
    LEFT JOIN "images" "i" ON ("p"."image_id" = "i"."id") 
    WHERE 
        "s"."complete" = false AND 
        "s"."created_by" = 205700489 
    GROUP BY 
        "s"."id", 
        "u"."first_name", 
        "u"."last_name", "i"."filename", 
        "i"."extension" 
    ORDER BY 
        "s"."created_at" DESC

Solution

  • Updated after clarification of the question:

    SELECT DISTINCT ON (s.created_at, s.id)
           s.*
          ,u.first_name
          ,u.last_name
          ,concat_ws('.', i.filename, i.extension) AS file
    FROM   stories s 
    LEFT   JOIN users  u ON u.uid = s.user_id
    LEFT   JOIN panels p ON p.story_id = s.id
    LEFT   JOIN images i ON i.id = p.image_id
    WHERE  s.complete = false
    AND    s.created_by = 205700489 
    ORDER  BY s.created_at DESC, s.id, p.updated_at DESC;
    

    Grouping by primary key requires PostgreSQL 9.1.
    I use concat_ws(), because I don't know which columns might be NULL. If both i.filename and i.extension are defined NOT NULL, you can simplify.

    Effect of the additional ORDER BY item p.updated_at DESC is that the "newest" file will be picked per story. The query technique is explained in full under this related question:
    Select first row in each GROUP BY group?