Search code examples
pythonpython-polars

Fill in the previous value from specific column based on a condition


I have a Polars DataFrame that looks something like so:

df = pl.DataFrame(
    {
        'val': [9, 7, 9, 11, 2, 5],
        'count': [1, 2, 1, 2, 1, 2],
        'id': [1, 1, 2, 2, 3, 3],
    }
)

What I need is to create a new column 'prev_val' which will contain values for the same unique id taken from a row where the value in 'count' column is smaller by one, i.e. looking something like:

┌─────┬───────┬─────┬──────────┐
│ val ┆ count ┆ id  ┆ prev_val │
│ --- ┆ ---   ┆ --- ┆ ---      │
│ i64 ┆ i64   ┆ i64 ┆ i64      │
╞═════╪═══════╪═════╪══════════╡
│ 9   ┆ 1     ┆ 1   ┆ null     │
│ 7   ┆ 2     ┆ 1   ┆ 9        │
│ 9   ┆ 1     ┆ 2   ┆ null     │
│ 11  ┆ 2     ┆ 2   ┆ 9        │
│ 2   ┆ 1     ┆ 3   ┆ null     │
│ 5   ┆ 2     ┆ 3   ┆ 2        │
└─────┴───────┴─────┴──────────┘

I couldn't figure a way of using native expressions so I tried doing this using map_rows and a UDF, even though Polars guide discourages the use of UDFs.

Is there maybe a native way to do this?


Solution

  • It looks like something you would express as a .join()

    df.join(df.with_columns(pl.col.count + 1), on=["id", "count"], how="left")
    
    shape: (6, 4)
    ┌─────┬───────┬─────┬───────────┐
    │ val ┆ count ┆ id  ┆ val_right │
    │ --- ┆ ---   ┆ --- ┆ ---       │
    │ i64 ┆ i64   ┆ i64 ┆ i64       │
    ╞═════╪═══════╪═════╪═══════════╡
    │ 9   ┆ 1     ┆ 1   ┆ null      │
    │ 7   ┆ 2     ┆ 1   ┆ 9         │
    │ 9   ┆ 1     ┆ 2   ┆ null      │
    │ 11  ┆ 2     ┆ 2   ┆ 9         │
    │ 2   ┆ 1     ┆ 3   ┆ null      │
    │ 5   ┆ 2     ┆ 3   ┆ 2         │
    └─────┴───────┴─────┴───────────┘
    

    If there can be multiple matches, you may want to use .unique() to reduce the search space.

    df.unique(subset=["id", "count"], keep="last", maintain_order=True)