Search code examples
python-polarsarray-reverse

Polars: How to find value in previous values by predicate?


How to find "price" of previous 'top'?

df = DataFrame({
    'price': [1, 2, 4, 2, 1, 2, 3, 2],
    'spikes': [None, None, 'top', None, 'bottom', None, 'top', None]
})

In result I want to find value of previous top/bottom price. Expected result would be:

df = DataFrame({
    'price': [1, 2, 4, 2, 1, 2, 3, 2],
    'spikes': [None, None, 'top', None, 'bottom', None, 'top', None],
    'prev_spikes_prices: [None, None, None, None, None, None, 4, None]
})

Solution

  • So this can be solved by grouping by spikes and shifting the price column by one and since you only want the prices where spikes is not null we add this as pl.when condition. See the following code:

    (df
     .with_columns(
        pl.when(pl.col("spikes").is_not_null())
        .then(pl.col("price").shift(1).over("spikes"))
        .otherwise(None).alias("prev_spikes_prices"))
    )
    
    
    
    shape: (8, 3)
    ┌───────┬────────┬────────────────────┐
    │ price ┆ spikes ┆ prev_spikes_prices │
    │ ---   ┆ ---    ┆ ---                │
    │ i64   ┆ str    ┆ i64                │
    ╞═══════╪════════╪════════════════════╡
    │ 1     ┆ null   ┆ null               │
    │ 2     ┆ null   ┆ null               │
    │ 4     ┆ top    ┆ null               │
    │ 2     ┆ null   ┆ null               │
    │ 1     ┆ bottom ┆ null               │
    │ 2     ┆ null   ┆ null               │
    │ 3     ┆ top    ┆ 4                  │
    │ 2     ┆ null   ┆ null               │
    └───────┴────────┴────────────────────┘