Search code examples
sqlprestotrino

SQL/Presto: how to get cumulative sum for a startime of items over ds


I have a data with the following format. I want to count how many items have been created over time.

-- table1
user_id  item_id  start_time  ds
 1         1       1/2/2021   1/1/2021
 1         2       1/1/2021   1/3/2021
 2         5       1/4/2021   1/4/2021
 2         3       1/4/2021   1/4/2021
 2         4       1/5/2021   1/6/2021

the desired result is:

   cum_count  ds
    1          1/1/2021
    2          1/3/2021
    4          1/4/2021
    5          1/6/2021

I think i should get something with window functions but get confused in the partion by and order by part.

What I have for now is

select ds, count(start_time) over (partition by user_id order by ds) as cum_count 
from table1

but I do not need user level specific cum_count now.


Solution

  • I think you want aggregation as well as a cumulative sum:

    select ds, count(*),
           sum(count(*)) over (order by ds) as running_count
    from table1
    group by ds
    order by ds;