
Pythonic way to update a column of a Polars data frame based on matching condition from another column


In Polars, what is a one-liner way to update the items of a column based on a matching condition in another column, perhaps by applying a lambda?

For example, I would like to multiply the items in col1 by 1000 wherever the corresponding item in col2 equals 'a'. Here's a crude way:

import polars as pl

df = pl.DataFrame({
    'col1': [1, 2, 3, 4],
    'col2': ['a', 'a', 'b', 'b'],
    'col3': [10.9, 12.0, 33.3, 34.4],
})

y_updated = []
for i in range(df.shape[0]):
    row = df[i]
    if row['col2'][0]=='a':
        y_updated.append(row['col1'][0]*1e3)
    else:
        y_updated.append(row['col1'][0])

df = df.with_columns(pl.Series(y_updated).alias('col1'))
print(df)

Output:

shape: (4, 3)
┌────────┬──────┬──────┐
│ col1   ┆ col2 ┆ col3 │
│ ---    ┆ ---  ┆ ---  │
│ f64    ┆ str  ┆ f64  │
╞════════╪══════╪══════╡
│ 1000.0 ┆ a    ┆ 10.9 │
│ 2000.0 ┆ a    ┆ 12.0 │
│ 3.0    ┆ b    ┆ 33.3 │
│ 4.0    ┆ b    ┆ 34.4 │
└────────┴──────┴──────┘

Solution

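  • You can update values conditionally with pl.when(...).then(...).otherwise(...). Note that with_columns returns a new DataFrame, so assign the result back to df if you want to keep the change.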

    df.with_columns(
        pl.when(pl.col("col2") == "a")
        .then(pl.col("col1") * 1000)
        .otherwise(pl.col("col1"))
    )
    
    # Output
    
    # shape: (4, 3)
    # ┌──────┬──────┬──────┐
    # │ col1 ┆ col2 ┆ col3 │
    # │ ---  ┆ ---  ┆ ---  │
    # │ i64  ┆ str  ┆ f64  │
    # ╞══════╪══════╪══════╡
    # │ 1000 ┆ a    ┆ 10.9 │
    # │ 2000 ┆ a    ┆ 12.0 │
    # │ 3    ┆ b    ┆ 33.3 │
    # │ 4    ┆ b    ┆ 34.4 │
    # └──────┴──────┴──────┘