Search code examples
sqlhadoophiveqlimpala

I want to add sum for running value on a column but if sequence fails then we have don't have to add


I have table like this

+----+--------+------+------+
| id | state  | num  | pop  |
+----+--------+------+------+
|  1 | ny     |    1 |  100 |
|  1 | ny     |    2 |  200 |
|  1 | ny     |    3 |  600 |
|  1 | ny     |    6 |  400 |
|  1 | ny     |    7 |  300 |
|  1 | ny     |   14 | 1000 |
|  2 | nj     |    3 |  250 |
+----+--------+------+------+

I want output as below

+---+----+----+------+------+
| 1 | ny |  1 |  100 |  900 |
| 1 | ny |  2 |  200 |  900 |
| 1 | ny |  3 |  600 |  900 |
| 1 | ny |  6 |  400 |  700 |
| 1 | ny |  7 |  300 |  700 |
| 1 | ny | 14 | 1000 | 1000 |
| 2 | nj |  3 |  250 |  250 |
+---+----+----+------+------+

So if there is a seq in num column then we have to add the pop column . So first 3 columns num column has 1,2,3 which is in sequence so we are adding pop column 100+200+600 and displaying as new column.

I tried below code but I am not receiving desired out put

select id, state,num, pop,
sum(pop) over (partition by id, state order by num )
from table

Solution

  • If you subtract a sequence, the values will be constant for the values in a row. Then you can use window functions:

    select t.*,
           sum(pop) over (partition by state, num - seqnum) as new_population
    from (select t.*,
                 row_number() over (partition by state order by num) as seqnum
          from t
         ) t;
    

    Here is a db<>fiddle (using Postgres).