Search code examples
sqlprestotrino

Get 1 row of data per day for a given user in PrestoDB


We have a very large number of rows (10's of millions) for a specific customer and I just want to pull 1 row of data for a specific user of the customer per day for a whole month. Here is a sample of the query for a single day.

select update_date, name, city
from mydb.mytable
WHERE year = '2021' and month = '05' and dd = '01'
  and customer_id = 'acme' and user_id = 'user123' 
limit 1 

How do I do this for each day of a month so that I get only 1 row per day for that one customer/user in the most efficient (fastest) manner using PrestDB?

EDIT: I should add that if there is at least 1 row, the single day query returns rather quickly, but if there are no rows for that customer/user for a given day, then it has to look through all the rows to get nothing and that is what takes a long time. I don't think anything but indexing would help solve this, right?


Solution

  • Use row_number():

    select update_date, name, city
    from (select t.*,
                 row_number() over (partition by user_id, year, month, day order by update_date desc) as seqnum
          from mydb.mytable t
          where year = '2021' and month = '05' and
                user_id = 'user123' and customer_id = 'acme'
         ) t
    where seqnum = 1;