Search code examples
python-polarsrust-polarsnodejs-polars

window agg over one value, but return another via Polars


I am trying to use polars to do a window aggregate over one value, but map it back to another.

For example, if i wanted to get the name of the max value in a group, instead of (or in combination to) just the max value.

assuming an input of something like this.

|label|name|value|
|a.  | foo| 1.   |
|a.  | bar| 2.   |
|b.  | baz| 1.5. |
|b.  | boo| -1   |
# 'max_by' is not a real method, just using it to express what i'm trying to achieve. 
df.select(col('label'), col('name').max_by('value').over('label'))

i want an output like this

|label|name|
|a.   | bar|
|b.   | baz|

ideally with the value too. But i know i can easily add that in via col('value').max().over('label').

|label|name|value|
|a.   | bar| 2.  |
|b.   | baz| 1.5.|

Solution

  • You were close. There is a sort_by expression that can be used.

    df.groupby('label').agg(pl.all().sort_by('value').last())
    
    shape: (2, 3)
    ┌───────┬──────┬───────┐
    │ label ┆ name ┆ value │
    │ ---   ┆ ---  ┆ ---   │
    │ str   ┆ str  ┆ f64   │
    ╞═══════╪══════╪═══════╡
    │ a.    ┆ bar  ┆ 2.0   │
    ├╌╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌╌╌┤
    │ b.    ┆ baz  ┆ 1.5   │
    └───────┴──────┴───────┘
    

    If you need a windowed version of this:

    df.with_columns([
        pl.col(['name','value']).sort_by('value').last().over('label').suffix("_max")
    ])
    
    shape: (4, 5)
    ┌───────┬──────┬───────┬──────────┬───────────┐
    │ label ┆ name ┆ value ┆ name_max ┆ value_max │
    │ ---   ┆ ---  ┆ ---   ┆ ---      ┆ ---       │
    │ str   ┆ str  ┆ f64   ┆ str      ┆ f64       │
    ╞═══════╪══════╪═══════╪══════════╪═══════════╡
    │ a.    ┆ foo  ┆ 1.0   ┆ bar      ┆ 2.0       │
    ├╌╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌┤
    │ a.    ┆ bar  ┆ 2.0   ┆ bar      ┆ 2.0       │
    ├╌╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌┤
    │ b.    ┆ baz  ┆ 1.5   ┆ baz      ┆ 1.5       │
    ├╌╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌┤
    │ b.    ┆ boo  ┆ -1.0  ┆ baz      ┆ 1.5       │
    └───────┴──────┴───────┴──────────┴───────────┘