Search code examples
pythonpython-polars

How can I use when, then and otherwise with multiple conditions in polars?


I have a data set with three columns. Column A is to be checked for strings. If the string matches foo or spam, the values in the same row for the other two columns L and G should be changed to XX. For this I have tried the following.

df = pl.DataFrame(
    {
        "A": ["foo", "ham", "spam", "egg",],
        "L": ["A54", "A12", "B84", "C12"],
        "G": ["X34", "C84", "G96", "L6",],
    }
)
print(df)

shape: (4, 3)
┌──────┬─────┬─────┐
│ A    ┆ L   ┆ G   │
│ ---  ┆ --- ┆ --- │
│ str  ┆ str ┆ str │
╞══════╪═════╪═════╡
│ foo  ┆ A54 ┆ X34 │
│ ham  ┆ A12 ┆ C84 │
│ spam ┆ B84 ┆ G96 │
│ egg  ┆ C12 ┆ L6  │
└──────┴─────┴─────┘

expected outcome

shape: (4, 3)
┌──────┬─────┬─────┐
│ A    ┆ L   ┆ G   │
│ ---  ┆ --- ┆ --- │
│ str  ┆ str ┆ str │
╞══════╪═════╪═════╡
│ foo  ┆ XX  ┆ XX  │
│ ham  ┆ A12 ┆ C84 │
│ spam ┆ XX  ┆ XX  │
│ egg  ┆ C12 ┆ L6  │
└──────┴─────┴─────┘

I tried this

df = df.with_column(
    pl.when((pl.col("A") == "foo") | (pl.col("A") == "spam"))
    .then((pl.col("L")= "XX") & (pl.col( "G")= "XX"))
    .otherwise((pl.col("L"))&(pl.col( "G")))
)

However, this does not work. Can someone help me with this?


Solution

  • For setting multiple columns to the same value you could use:

    df.with_columns(
       pl.when(pl.col("A").is_in(["foo", "spam"]))
         .then(pl.lit("XX"))
         .otherwise(pl.col("L", "G"))
         .name.keep()
    )
    
    shape: (4, 3)
    ┌──────┬─────┬─────┐
    │ A    ┆ L   ┆ G   │
    │ ---  ┆ --- ┆ --- │
    │ str  ┆ str ┆ str │
    ╞══════╪═════╪═════╡
    │ foo  ┆ XX  ┆ XX  │
    │ ham  ┆ A12 ┆ C84 │
    │ spam ┆ XX  ┆ XX  │
    │ egg  ┆ C12 ┆ L6  │
    └──────┴─────┴─────┘
    
    • .is_in() can be used instead of multiple == x | == y chains

    If you need different values, you can pack them into a struct and extract the fields.

    df.with_columns(
       pl.when(pl.col("A").is_in(["foo", "spam"]))
         .then(pl.struct(L = pl.lit("AAA"), G = pl.lit("BBB")))
         .otherwise(pl.struct("L", "G"))
         .struct.field("L", "G")
    )
    
    shape: (4, 3)
    ┌──────┬─────┬─────┐
    │ A    ┆ L   ┆ G   │
    │ ---  ┆ --- ┆ --- │
    │ str  ┆ str ┆ str │
    ╞══════╪═════╪═════╡
    │ foo  ┆ AAA ┆ BBB │
    │ ham  ┆ A12 ┆ C84 │
    │ spam ┆ AAA ┆ BBB │
    │ egg  ┆ C12 ┆ L6  │
    └──────┴─────┴─────┘