I can't figure out how to build a simple polars filter command dynamically from kwargs.
With pandas it seems quite straightforward:
df.query(" and ".join(f"{key} == {repr(value)}" for key, value in kwargs.items()))
[Update]: kwargs support has been added to .filter(), which will be available from Polars 0.19.9.
This means the pl.all_horizontal() example below can be simplified to:
import polars as pl

df = pl.DataFrame({"A": [1, 2, 3, 2, 1], "B": [4, 5, 6, 7, 8]})
kwargs = {"A": 2, "B": 5}
df.filter(**kwargs)
shape: (1, 2)
┌─────┬─────┐
│ A ┆ B │
│ --- ┆ --- │
│ i64 ┆ i64 │
╞═════╪═════╡
│ 2 ┆ 5 │
└─────┴─────┘
Polars expressions are actually objects and can be described as "lazy".
>>> pl.col('foo') == pl.lit('bar')
<polars.expr.expr.Expr at 0x1273a0880>
>>> pl.when(pl.col('foo').is_in([1, 2, 3]))
<polars.expr.whenthen.When at 0x12a37f6a0>
They don't actually do anything until they are evaluated by Polars itself, e.g. inside df.with_columns().
This means you can just build your expression directly:
df = pl.DataFrame({"A": [1, 2, 3, 2, 1], "B": [4, 5, 6, 7, 8]})
kwargs = {"A": 2, "B": 5}
df.with_columns(
    query=pl.concat_list(
        pl.col(key) == pl.lit(value) for key, value in kwargs.items()
    )
)
shape: (5, 3)
┌─────┬─────┬────────────────┐
│ A ┆ B ┆ query │
│ --- ┆ --- ┆ --- │
│ i64 ┆ i64 ┆ list[bool] │
╞═════╪═════╪════════════════╡
│ 1 ┆ 4 ┆ [false, false] │
│ 2 ┆ 5 ┆ [true, true] │
│ 3 ┆ 6 ┆ [false, false] │
│ 2 ┆ 7 ┆ [true, false] │
│ 1 ┆ 8 ┆ [false, false] │
└─────┴─────┴────────────────┘
If you want to filter rows based on this, you can use pl.all_horizontal() to AND the booleans (or pl.any_horizontal() for OR):
df.filter(
    pl.all_horizontal(
        pl.col(key) == pl.lit(value) for key, value in kwargs.items()
    )
)
shape: (1, 2)
┌─────┬─────┐
│ A ┆ B │
│ --- ┆ --- │
│ i64 ┆ i64 │
╞═════╪═════╡
│ 2 ┆ 5 │
└─────┴─────┘
pl.sql_expr / SQLContext may also be of interest: see "Pandas.eval replacement in polars".