I have some expressions that I will evaluate later, either inside or outside a window function. This normally works fine: if I have pl.col("x").max(), I can add .over("y") later, and if I have pl.arange(0, pl.count()), I can add .over("y") later. The one expression this does not work on is pl.count().
If you try to window pl.count(), Polars errors:
import polars as pl
df = pl.DataFrame(dict(x=[1,1,0,0], y=[1,2,3,4]))
expression = pl.count()
df.with_columns([expression.over("x").alias("z")])
# exceptions.ComputeError: Cannot apply a window function, did not find a root column. This is likely due to a syntax error in this expression: count()
Is there a version of count that can handle being windowed? I know that I can do pl.col("x").count().over("x"), but then I have to know ahead of time what columns will exist, and the expressions and the window columns come from completely different parts of my code.
Upgrade to Polars >= 0.14. As of that release, the code from the question works without modification:
import polars as pl
df = pl.DataFrame(dict(x=[1,1,0,0], y=[1,2,3,4]))
expression = pl.count()
df.with_columns([expression.over("x").alias("z")])
# shape: (4, 3)
# ┌─────┬─────┬─────┐
# │ x   ┆ y   ┆ z   │
# │ --- ┆ --- ┆ --- │
# │ i64 ┆ i64 ┆ u32 │
# ╞═════╪═════╪═════╡
# │ 1   ┆ 1   ┆ 2   │
# │ 1   ┆ 2   ┆ 2   │
# │ 0   ┆ 3   ┆ 2   │
# │ 0   ┆ 4   ┆ 2   │
# └─────┴─────┴─────┘
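
As a cross-check on the z column above, here is a plain-Python sketch of what a windowed count computes: each row receives the size of the group its key belongs to. It does not use Polars at all, and the helper name windowed_count is made up for illustration.

```python
from collections import Counter


def windowed_count(keys):
    """Return, for each row, the number of rows that share that row's key."""
    sizes = Counter(keys)             # group size per distinct key
    return [sizes[k] for k in keys]   # broadcast each group size back to its rows


x = [1, 1, 0, 0]           # the window column from the example
z = windowed_count(x)
print(z)  # [2, 2, 2, 2]
```

Both groups (x == 1 and x == 0) contain two rows, which is why every row in z is 2.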