python, python-polars

Improving polars statement that adds a column applying a lambda function on each row


I am trying to add a column using map_rows in polars. The pandas equivalent is as follows:

import pandas as pd
df = pd.DataFrame({"ref": [-1, 2, 8], "v1": [-1, 5, 0], "v2": [-1, 5, 8]})
df['count'] = df.apply(lambda r: len([i for i in r if i == r.iloc[0]]) - 1, axis=1)
df = df.drop('ref', axis=1)
df
   v1  v2  count
0  -1  -1      2
1   5   5      0
2   0   8      1

The following is my polars version. It works as desired, but it looks ugly and can probably be improved.

import polars as pl

df = pl.DataFrame({"ref": [-1, 2, 8], "v1": [-1, 5, 0], "v2": [-1, 5, 8]})

x = df.map_rows(lambda r: len([i for i in r if i == r[0]]) - 1).rename({'map': 'count'})
df = df.hstack([x.to_series()]).drop('ref')

df
shape: (3, 3)
┌─────┬─────┬───────┐
│ v1  ┆ v2  ┆ count │
│ --- ┆ --- ┆ ---   │
│ i64 ┆ i64 ┆ i64   │
╞═════╪═════╪═══════╡
│ -1  ┆ -1  ┆ 2     │
│ 5   ┆ 5   ┆ 0     │
│ 0   ┆ 8   ┆ 1     │
└─────┴─────┴───────┘

What bothers me is the rename and hstack that I cobbled together to make it work. I would be grateful for any improvements to the above code.

TIA


Solution

  • The idea is to use Polars Expressions instead of applying custom Python functions/lambdas.

    It looks like you're trying to count, for each row, how many of the other columns have the same value as ref?

    df.select(pl.exclude("ref") == pl.col("ref"))
    
    shape: (3, 2)
    ┌───────┬───────┐
    │ v1    ┆ v2    │
    │ ---   ┆ ---   │
    │ bool  ┆ bool  │
    ╞═══════╪═══════╡
    │ true  ┆ true  │
    │ false ┆ false │
    │ false ┆ true  │
    └───────┴───────┘
    

    pl.sum_horizontal() can be used to get a "count" of the true values on each row.

    df.with_columns(count = pl.sum_horizontal(pl.exclude("ref") == pl.col("ref")))
    
    shape: (3, 4)
    ┌─────┬─────┬─────┬───────┐
    │ ref ┆ v1  ┆ v2  ┆ count │
    │ --- ┆ --- ┆ --- ┆ ---   │
    │ i64 ┆ i64 ┆ i64 ┆ u32   │
    ╞═════╪═════╪═════╪═══════╡
    │ -1  ┆ -1  ┆ -1  ┆ 2     │
    │ 2   ┆ 5   ┆ 5   ┆ 0     │
    │ 8   ┆ 0   ┆ 8   ┆ 1     │
    └─────┴─────┴─────┴───────┘