Is it possible to convert the following filter, which uses two conditions, into something that uses expression expansion or a custom function, in order to apply the DRY principle (avoid the repetition)?
Here is the example:
import polars as pl

df = pl.DataFrame(
    {
        "a": [1, 2, 3, 4, 5],
        "val1": [1, None, 0, 0, None],
        "val2": [1, None, None, 0, 1],
    }
)

df.filter((~pl.col("val1").is_in([None, 0])) | (~pl.col("val2").is_in([None, 0])))
Results in:
┌─────┬──────┬──────┐
│ a   ┆ val1 ┆ val2 │
│ --- ┆ ---  ┆ ---  │
│ i64 ┆ i64  ┆ i64  │
╞═════╪══════╪══════╡
│ 1   ┆ 1    ┆ 1    │
│ 5   ┆ null ┆ 1    │
└─────┴──────┴──────┘
pl.any_horizontal() and pl.all_horizontal() can be used to build | and & chains. .not_() can also be used instead of ~ if you prefer.
df.filter(
    pl.any_horizontal(
        pl.col("val1", "val2").is_in([None, 0]).not_()
    )
)
shape: (2, 3)
┌─────┬──────┬──────┐
│ a   ┆ val1 ┆ val2 │
│ --- ┆ ---  ┆ ---  │
│ i64 ┆ i64  ┆ i64  │
╞═════╪══════╪══════╡
│ 1   ┆ 1    ┆ 1    │
│ 5   ┆ null ┆ 1    │
└─────┴──────┴──────┘