How to combine group_by with a rolling window in Polars?


import polars as pl

data = {'type': ['A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'B', 'B', 'B', 'B', 'B', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C'],
        'value': [5, 9, 1, 0, 3, 2, 5, 8, 9, 1, 0, 3, 3, 1, 1, 0, 2, 0, 0, 5, 7, 4, 7, 8, 9, 11, 1, 1, 0, 1, 4, 3, 21]}
df = pl.DataFrame(data)
print(df)

Given these two columns, how can I group the rows by the 'type' column, compute a rolling sum of the 'value' column with a window size of 5 within each group, and store the output in a new column named 'result'?

The expected values of the 'result' column are:

[None, None, None, None, 18, 15, 11, 18, 27, 25, 23, 21, 16, None, None, None, None, 4, 3, 7, 14, 16, None, None, None, None, 36, 30, 22, 14, 7, 9, 29]

(Please use the Polars library only; Polars version 0.17.9.)


Solution

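  • Use .rolling_sum together with .over. Because of .over("type"), the rolling sum is evaluated separately within each type, so a window never spans two groups and the first four rows of each group are null.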

    df.with_columns(result = 
       pl.col("value").rolling_sum(window_size=5).over("type")
    )
    
    shape: (33, 3)
    ┌──────┬───────┬────────┐
    │ type ┆ value ┆ result │
    │ ---  ┆ ---   ┆ ---    │
    │ str  ┆ i64   ┆ i64    │
    ╞══════╪═══════╪════════╡
    │ A    ┆ 5     ┆ null   │
    │ A    ┆ 9     ┆ null   │
    │ A    ┆ 1     ┆ null   │
    │ A    ┆ 0     ┆ null   │
    │ …    ┆ …     ┆ …      │
    │ C    ┆ 1     ┆ 14     │
    │ C    ┆ 4     ┆ 7      │
    │ C    ┆ 3     ┆ 9      │
    │ C    ┆ 21    ┆ 29     │
    └──────┴───────┴────────┘
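
To make the windowing behaviour easier to follow, here is a plain-Python sketch of what the expression above appears to compute. It is only an illustration of the semantics, not how Polars implements it, and it is not a replacement for the Polars answer. The helper name grouped_rolling_sum is hypothetical, and the sketch assumes that a row gets None until its group has accumulated window_size values, which is what the expected output in the question shows.

```python
from collections import defaultdict, deque


def grouped_rolling_sum(types, values, window_size):
    """Rolling sum of `values` computed independently for each key in `types`.

    Hypothetical helper for illustration only. A row yields None until its
    group has seen `window_size` values.
    """
    # One fixed-length window per group; a full deque drops its oldest item
    # automatically when a new one is appended.
    windows = defaultdict(lambda: deque(maxlen=window_size))
    result = []
    for key, value in zip(types, values):
        window = windows[key]
        window.append(value)
        result.append(sum(window) if len(window) == window_size else None)
    return result


# Same data as in the question: 13 rows of A, 9 of B, 11 of C.
types = ['A'] * 13 + ['B'] * 9 + ['C'] * 11
values = [5, 9, 1, 0, 3, 2, 5, 8, 9, 1, 0, 3, 3,
          1, 1, 0, 2, 0, 0, 5, 7, 4,
          7, 8, 9, 11, 1, 1, 0, 1, 4, 3, 21]
expected = [None, None, None, None, 18, 15, 11, 18, 27, 25, 23, 21, 16,
            None, None, None, None, 4, 3, 7, 14, 16,
            None, None, None, None, 36, 30, 22, 14, 7, 9, 29]

result = grouped_rolling_sum(types, values, window_size=5)
print(result == expected)
```

Keeping a separate window per group is the part that .over("type") supplies in the Polars expression; without it, a single window would slide across the A/B and B/C boundaries and mix values from different types.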