Search code examples
sqlsumhiveqlwindow-functions

SUM OVER () with Group by and preceding


I have a code that works well - with just the dates by phone usage.

select  date,
        data,
        SUM(data) OVER (ORDER BY date asc
                            ROWS between 30 PRECEDING and current row) data_30,
        texts,
        SUM(tests) OVER (ORDER BY date asc
                ROWS between 30 PRECEDING and current row) text_30,
        voice,
        SUM(voice) OVER (ORDER BY date asc
                ROWS between 30 PRECEDING and current row) voice_30,
        wifi,
        SUM(wifi) 
                OVER (ORDER BY date asc
                ROWS between 30 PRECEDING and current row) wifi_30
    FROM Table

I am just figuring out how to use sum overs the 30 preceding day, but is it possible at all to include a second variable say, I want to see date, by rate plan for these usages?

Something like

select  date,
        plan, b, c, d, 
        data,
        SUM(data) OVER (ORDER BY date asc
                            ROWS between 30 PRECEDING and current row) data_30,
        texts,
        SUM(tests) OVER (ORDER BY date asc
                ROWS between 30 PRECEDING and current row) text_30,
        voice,
        SUM(voice) OVER (ORDER BY date asc
                ROWS between 30 PRECEDING and current row) voice_30,
        wifi,
        SUM(wifi) 
                OVER (ORDER BY date asc
                ROWS between 30 PRECEDING and current row) wifi_30
    FROM Table
    group by date, plan, b, c, d

Results of just date

date  usage  last30sum
1/1   1       1
1/2   1       2
....
1/20  1       30

If I had source data of

date line     rateplan        usage  
1/1  phone1   10gbplan        1
1/1  phone2   unlimited       2                
1/2  phone3   10gbplan        1                
....
1/30 phone200 10gbplan        1                

I want to see sorted out

date plan        totalusage   rolling_30
1/2  10gbplan   1            4+ sum(28 days before 1/2)

Can you group it by date, unl to get the last 30 days that are unl, a, b, c just means other group bys, could be like by device model, or area.


Solution

  • I figure it out by adding in partition by

    select  date,
            plan, b, c, d, 
            data,
            SUM(data) OVER (partition by plan, b, c, d
                            ORDER BY date asc
                                ROWS between 30 PRECEDING and current row) data_30,
            texts,
            SUM(tests) OVER (partition by plan, b, c, d
                    ORDER BY date asc
                    ROWS between 30 PRECEDING and current row) text_30,
            voice,
            SUM(voice) OVER (partition by plan, b, c, d
                    ORDER BY date asc
                    ROWS between 30 PRECEDING and current row) voice_30,
            wifi,
            SUM(wifi) 
                    OVER (partition by plan, b, c, d
                    ORDER BY date asc
                    ROWS between 30 PRECEDING and current row) wifi_30
        FROM Table