python-polars

Look ahead in group_by_dynamic


When I run group_by_dynamic on an index column and aggregate a column into a list (to see which values end up in each window), the list for a given group and a given index i contains the values of the correct group, but for indices i, i+1, ... up to the end of the period. This look-ahead in the index seems to contrast with rolling_mean, which by default uses the values at lower indices, not higher ones, to compute the rolling mean.

Is it intentionally designed this way, and if so, how can I perform a group_by_dynamic that uses lower indices instead? (I am not sure what the offset parameter does, but it does not seem to be what I want here.)

Here is an example that I expected not to raise:

import polars as pl

df = pl.DataFrame(
    {
        "index": [0, 0, 1, 1],
        "group": ["banana", "pear", "banana", "pear"],
        "weight": [2, 3, 5, 7],
    }
)

agg = df.group_by_dynamic("index", group_by="group", every="1i", period="2i").agg(pl.col("weight"))

assert((
    agg
    .filter(index=0, group="banana")
    .select("weight")
    .to_series()
    .to_list()
) == [[2]])

Thank you


Solution

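  • As mentioned, one solution is to use rolling, which builds each window by looking back from the current row instead of ahead.

    With your example, group_by_dynamic looks ahead: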

    df.group_by_dynamic("index", group_by="group", every="1i", period="2i").agg(pl.col("weight"))
    
    shape: (4, 3)
    ┌────────┬───────┬───────────┐
    │ group  ┆ index ┆ weight    │
    │ ---    ┆ ---   ┆ ---       │
    │ str    ┆ i64   ┆ list[i64] │
    ╞════════╪═══════╪═══════════╡
    │ banana ┆ 0     ┆ [2, 5]    │
    │ banana ┆ 1     ┆ [5]       │
    │ pear   ┆ 0     ┆ [3, 7]    │
    │ pear   ┆ 1     ┆ [7]       │
    └────────┴───────┴───────────┘
    

    With rolling, each window looks back instead:

    df.rolling(index_column="index", period="2i", group_by="group").agg(pl.col("weight"))
    
    shape: (4, 3)
    ┌────────┬───────┬───────────┐
    │ group  ┆ index ┆ weight    │
    │ ---    ┆ ---   ┆ ---       │
    │ str    ┆ i64   ┆ list[i64] │
    ╞════════╪═══════╪═══════════╡
    │ banana ┆ 0     ┆ [2]       │
    │ banana ┆ 1     ┆ [2, 5]    │
    │ pear   ┆ 0     ┆ [3]       │
    │ pear   ┆ 1     ┆ [3, 7]    │
    └────────┴───────┴───────────┘
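
To make the difference between the two outputs above concrete, here is a minimal pure-Python sketch of the two window conventions. It is only a model, not how polars implements either method. It assumes the default closures (group_by_dynamic windows are `[start, start + period)`, rolling windows are `(t - period, t]`), a window step of 1, one row per index within each group, and windows that start only at observed index values. The helper names `look_ahead_windows` and `look_back_windows` are made up for illustration.

```python
from collections import defaultdict

# Same toy data as in the question: (index, group, weight).
rows = [(0, "banana", 2), (0, "pear", 3), (1, "banana", 5), (1, "pear", 7)]


def by_group(rows):
    """Collect (index, weight) pairs per group, preserving row order."""
    groups = defaultdict(list)
    for index, group, weight in rows:
        groups[group].append((index, weight))
    return groups


def look_ahead_windows(rows, period):
    """Model of group_by_dynamic: the window labelled `start` is [start, start + period)."""
    out = {}
    for group, items in by_group(rows).items():
        for start in sorted({i for i, _ in items}):
            out[(group, start)] = [w for i, w in items if start <= i < start + period]
    return out


def look_back_windows(rows, period):
    """Model of rolling: the window labelled `t` is (t - period, t]."""
    out = {}
    for group, items in by_group(rows).items():
        for t in sorted({i for i, _ in items}):
            out[(group, t)] = [w for i, w in items if t - period < i <= t]
    return out


ahead = look_ahead_windows(rows, period=2)
back = look_back_windows(rows, period=2)

print(ahead[("banana", 0)])  # [2, 5]  (looks ahead to index 1)
print(back[("banana", 0)])   # [2]     (nothing at lower indices yet)
print(back[("banana", 1)])   # [2, 5]  (looks back to index 0)
```

Under these assumptions the model reproduces both tables, and the look-back variant gives `[2]` for banana at index 0, which is the value the assert in the question expects.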