sql postgresql pivot greatest-n-per-group

How to select the first and last row relevant to an item in a table?


I have the following events table that tracks events related to various items: when an item was created, when it was updated, and finally when it was deleted.

For example it would look like this:

id  item_id  type
1   1        create
2   2        create
3   1        update
4   1        update
5   2        update
6   1        delete
7   2        update
8   3        create

So in this table item 1 was created, then item 2 was created, then both items were updated multiple times. In the end item 1 was deleted, but item 2 is still there. Finally, item 3 was created and never modified.

I'm trying to create a query that would give me the first and last event for each item, so after running this query I would get this:

id  item_id  first_event_type  last_event_type
1   1        create            delete
2   2        create            update
8   3        create            NULL

I've tried various queries, but didn't get much further than joining the table on itself, as I don't know how to express that the joined table should return the last result relevant to the current item_id.

Any idea if it is possible to do this with just one query?

(In case it makes a difference, I am using PostgreSQL.)


Solution

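  • You can use window functions and conditional aggregation: number each item's rows by id in both ascending and descending order, keep only the rows ranked first in either direction, then aggregate per item: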

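    select item_id,
        -- the row ranked first in ascending id order is the first event
        max(type) filter(where rn_asc = 1) as first_event_type,
        -- rn_asc > 1 leaves this null when the item has only one event
        max(type) filter(where rn_desc = 1 and rn_asc > 1) as last_event_type
    from (
        select t.*,
            row_number() over(partition by item_id order by id) rn_asc,
            row_number() over(partition by item_id order by id desc) rn_desc
        from mytable t
    ) t
    -- keep only the first and last row of each item
    where 1 in (rn_asc, rn_desc)
    group by item_id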
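
To try the query against the sample data from the question, a minimal setup could look like the sketch below. The table name mytable comes from the answer; the column types are assumptions, since the question does not state them. The extra min(id) column and the final order by are additions for illustration only: the first recovers the id column shown in the desired output, the second makes the row order deterministic.

```sql
-- Assumed schema: the question does not give the column types.
create table mytable (
    id      int primary key,
    item_id int  not null,
    type    text not null
);

-- Sample rows copied from the question.
insert into mytable (id, item_id, type) values
    (1, 1, 'create'),
    (2, 2, 'create'),
    (3, 1, 'update'),
    (4, 1, 'update'),
    (5, 2, 'update'),
    (6, 1, 'delete'),
    (7, 2, 'update'),
    (8, 3, 'create');

select
    -- illustrative addition: id of the item's first event
    min(id) as id,
    item_id,
    max(type) filter(where rn_asc = 1) as first_event_type,
    max(type) filter(where rn_desc = 1 and rn_asc > 1) as last_event_type
from (
    select t.*,
        row_number() over(partition by item_id order by id) rn_asc,
        row_number() over(partition by item_id order by id desc) rn_desc
    from mytable t
) t
where 1 in (rn_asc, rn_desc)
group by item_id
order by item_id;  -- illustrative addition: fixes the output order

-- Expected result for the sample data:
--  id | item_id | first_event_type | last_event_type
--   1 |       1 | create           | delete
--   2 |       2 | create           | update
--   8 |       3 | create           | NULL
```

The rn_asc > 1 condition is what produces NULL for item 3: its only row is both the first and the last, so it is reported as the first event only.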