Polars: group_by rolling sum


Say I have this DataFrame:

import polars as pl
df = pl.DataFrame({'group': [1, 1, 1, 3, 3, 3, 4, 4], 'value': [1, 4, 2, 5, 3, 4, 2, 3]})

I'd like to get a rolling sum, with a window of 2, computed separately within each group.

Expected output is:

┌───────┐
│ value │
│ ---   │
│ i64   │
╞═══════╡
│ 1     │
│ 5     │
│ 6     │
│ 5     │
│ 8     │
│ 7     │
│ 2     │
│ 5     │
└───────┘

Solution

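  • .rolling_sum() can be used with .over(), which evaluates the expression separately within each group.

    With a window of 2, the first row of each group has no full window, so it would be null by default (min_periods defaults to the window size). Setting min_periods=1 lets those partial windows return a sum instead. As far as I know, recent Polars releases rename this parameter to min_samples.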

    df.with_columns(
        pl.col("value")
        .rolling_sum(2, min_periods=1)
        .over("group")
        .alias("rolling_sum_over_group")
    )
    
    shape: (8, 3)
    ┌───────┬───────┬────────────────────────┐
    │ group ┆ value ┆ rolling_sum_over_group │
    │ ---   ┆ ---   ┆ ---                    │
    │ i64   ┆ i64   ┆ i64                    │
    ╞═══════╪═══════╪════════════════════════╡
    │ 1     ┆ 1     ┆ 1                      │
    │ 1     ┆ 4     ┆ 5                      │
    │ 1     ┆ 2     ┆ 6                      │
    │ 3     ┆ 5     ┆ 5                      │
    │ 3     ┆ 3     ┆ 8                      │
    │ 3     ┆ 4     ┆ 7                      │
    │ 4     ┆ 2     ┆ 2                      │
    │ 4     ┆ 3     ┆ 5                      │
    └───────┴───────┴────────────────────────┘
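
To make the semantics concrete, here is a plain-Python sketch of what the expression above computes. It is only an illustration, not how Polars implements it: the function name rolling_sum_over_group is hypothetical, and it assumes the values contain no nulls.

```python
from collections import defaultdict, deque


def rolling_sum_over_group(groups, values, window=2, min_periods=1):
    """Per-group rolling sum, returned in the original row order.

    Hypothetical helper for illustration; assumes no missing values.
    """
    # One bounded buffer per group holds that group's most recent values.
    recent = defaultdict(lambda: deque(maxlen=window))
    out = []
    for g, v in zip(groups, values):
        buf = recent[g]
        buf.append(v)  # a full deque drops its oldest value automatically
        # Emit None (where Polars would emit null) until the group has
        # contributed at least min_periods values to the window.
        out.append(sum(buf) if len(buf) >= min_periods else None)
    return out


groups = [1, 1, 1, 3, 3, 3, 4, 4]
values = [1, 4, 2, 5, 3, 4, 2, 3]

result = rolling_sum_over_group(groups, values)
print(result)  # [1, 5, 6, 5, 8, 7, 2, 5]

# Requiring a full window leaves the first row of each group empty.
strict = rolling_sum_over_group(groups, values, min_periods=2)
print(strict)  # [None, 5, 6, None, 8, 7, None, 5]
```

The second call shows why min_periods=1 matters: with the window-sized default, the first row of every group has nothing to pair with and comes back empty.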