Search code examples
sqlpostgresqldatetimesql-order-bygreatest-n-per-group

Select newest entry for each user without using group by (postgres)


I have a table myTable with four columns:

id        UUID,
user_id   UUID ,
text      VARCHAR ,
date      TIMESTAMP

(id is the primary key and user_id is not unique in this table)

I want to retrieve the user_ids ordered by their newest entry, which i am currently doing with this query:

SELECT user_id FROM myTable GROUP BY user_id ORDER BY MAX(date) DESC

The problem is that GROUP BY takes a long time. Is there a faster way to accomplish this? I tried using a window function with PARTITION BY as described here Retrieving the last record in each group - MySQL, but it didn't really speed things up. I've also made sure that user_id is indexed.

My postgres version is 10.4

Edit: The query above that I'm currently using is functionally correct, the problem is that it's slow.


Solution

  • Start with an index on user_id, date desc. That might help.

    You can also try filtering -- once you have such an index:

    select t.user_id
    from myTable t
    where t.date = (select max(t2.date)
                    from myTable t2
                    where t2.user_id = t.user_id
                   )
    order by t.date desc
    

    However, you might find that the order by ends up taking almost as much time as the group by.

    This version will definitely use the index for the subquery:

    select user_id
    from (select distinct on (user_id) user_id, date
          from myTable t
          order by user_id, date desc
         ) t
    order by date desc;