Tags: sql, postgresql, postgresql-9.6

Fetch the row which has the latest occurrence based on multiple ids


I am trying to find the last message per user per session. Each message row has the user id and the number of the user session it belongs to (incrementing by 1 for each session).

My table looks like this:

user_id | timestamp                  | usermessage   | user_session_id
1       | 2019-08-28 14:50:39.150000 | hi            | 1
1       | 2019-08-28 14:50:40.150000 | goodbye       | 1
1       | 2019-09-01 10:50:39.150000 | hello again   | 2
1       | 2019-09-01 11:50:39.150000 | goodbye again | 2
2       | 2019-07-09 07:53:56.680000 | hello         | 1
2       | 2019-07-10 09:23:16.100000 | hi there      | 2

I am looking to retrieve the last message (latest timestamp) each user posted per session, so my expected output would be something like this:

user_id | timestamp                  | usermessage   | user_session_id
1       | 2019-08-28 14:50:40.150000 | goodbye       | 1
1       | 2019-09-01 11:50:39.150000 | goodbye again | 2
2       | 2019-07-09 07:53:56.680000 | hello         | 1
2       | 2019-07-10 09:23:16.100000 | hi there      | 2

Thank you in advance


Solution

  • In Postgres, I would recommend distinct on:

    select distinct on (user_id, user_session_id) t.*
    from t
    order by user_id, user_session_id, timestamp desc;
    

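    distinct on keeps only the first row of each (user_id, user_session_id) group, taken in the order given by order by, so sorting by timestamp desc within each group returns the latest message. In Postgres, this is usually the most efficient method for this type of query. For best performance, you want an index on (user_id, user_session_id, timestamp desc).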
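
To try the query and the suggested index end to end, a minimal sketch along these lines should work. The table name t and the column names come from the question and answer; the column types and the index name t_user_session_ts_idx are assumptions made for illustration, not something the original post specifies.

```sql
-- Assumed schema: only the column names come from the question,
-- the types are guesses. "timestamp" is quoted because it is also a type name.
CREATE TABLE t (
    user_id         integer   NOT NULL,
    "timestamp"     timestamp NOT NULL,
    usermessage     text,
    user_session_id integer   NOT NULL
);

-- Sample rows copied from the question.
INSERT INTO t (user_id, "timestamp", usermessage, user_session_id) VALUES
    (1, '2019-08-28 14:50:39.150', 'hi',            1),
    (1, '2019-08-28 14:50:40.150', 'goodbye',       1),
    (1, '2019-09-01 10:50:39.150', 'hello again',   2),
    (1, '2019-09-01 11:50:39.150', 'goodbye again', 2),
    (2, '2019-07-09 07:53:56.680', 'hello',         1),
    (2, '2019-07-10 09:23:16.100', 'hi there',      2);

-- Index matching the ORDER BY of the query below
-- (the index name is hypothetical).
CREATE INDEX t_user_session_ts_idx
    ON t (user_id, user_session_id, "timestamp" DESC);

-- One row per (user_id, user_session_id): the one with the latest timestamp.
SELECT DISTINCT ON (user_id, user_session_id) t.*
FROM t
ORDER BY user_id, user_session_id, "timestamp" DESC;
```

The leading ORDER BY expressions have to match the DISTINCT ON list, which is why user_id and user_session_id come first and the timestamp is only the tie-breaker inside each group. On the sample rows, the query returns one row for each of the four (user_id, user_session_id) pairs.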