I am experimenting with a polars DataFrame. The first column holds strings or nulls, the second holds numbers or nulls. The remaining columns contain only non-null data.
import polars as pl

df = pl.from_repr("""
┌─────────┬─────────┬─────────┐
│ Column1 ┆ Column2 ┆ Column3 │
│ --- ┆ --- ┆ --- │
│ str ┆ i64 ┆ str │
╞═════════╪═════════╪═════════╡
│ foo ┆ null ┆ a │
│ null ┆ null ┆ b │
│ bar ┆ 1 ┆ c │
└─────────┴─────────┴─────────┘
""")
I try to replace the null values with a fixed value:
df = df.with_columns(pl.when(pl.col("Column1").is_null()).then(pl.lit("String")))
df = df.with_columns(pl.when(pl.col("Column2").is_null()).then(0))
But I get:
shape: (3, 4)
┌─────────┬─────────┬─────────┬─────────┐
│ Column1 ┆ Column2 ┆ Column3 ┆ literal │
│ --- ┆ --- ┆ --- ┆ --- │
│ str ┆ i64 ┆ str ┆ i32 │
╞═════════╪═════════╪═════════╪═════════╡
│ foo ┆ null ┆ a ┆ 0 │
│ null ┆ null ┆ b ┆ 0 │
│ bar ┆ 1 ┆ c ┆ null │
└─────────┴─────────┴─────────┴─────────┘
Instead of what I want:
shape: (3, 3)
┌─────────┬─────────┬─────────┐
│ Column1 ┆ Column2 ┆ Column3 │
│ --- ┆ --- ┆ --- │
│ str ┆ i64 ┆ str │
╞═════════╪═════════╪═════════╡
│ foo ┆ 0 ┆ a │
│ String ┆ 0 ┆ b │
│ bar ┆ 1 ┆ c │
└─────────┴─────────┴─────────┘
My original idea comes from the related post "Conditional assignment in polars dataframe", but I cannot see my mistake. What am I missing?
.fill_null() can be used to replace the nulls directly.
df.with_columns(
    pl.col("Column1").fill_null("String"),
    pl.col("Column2").fill_null(0),
)
shape: (3, 3)
┌─────────┬─────────┬─────────┐
│ Column1 ┆ Column2 ┆ Column3 │
│ --- ┆ --- ┆ --- │
│ str ┆ i64 ┆ str │
╞═════════╪═════════╪═════════╡
│ foo ┆ 0 ┆ a │
│ String ┆ 0 ┆ b │
│ bar ┆ 1 ┆ c │
└─────────┴─────────┴─────────┘