Update: This issue is no longer present in Polars, and my function runs without error.
I want to use a custom function with rolling_map in Polars. However, I get a TypeError when running the code below.
import polars as pl

def ts_rank(expr: pl.Expr, window: int) -> pl.Expr:
    res = expr.cast(pl.Float64).rolling_map(
        lambda s: s.rank(method='average', descending=False)[-1]/s.is_not_null().sum(),
        window_size = window,
        min_periods = window//2).over('a')
    return res

df = pl.DataFrame({"a": [1, 1, 1, 1, 2, 2, 2, 2],
                   "b": [None, None, None, 1, 4, 2, 3, 8]})
df.with_columns(ts_rank(pl.col('b'),4).alias('rank'))
I got this error:
PanicException: python function failed: PyErr { type: , value: TypeError("unsupported operand type(s) for /: 'NoneType' and 'int'"), traceback: Some() }
Is this the correct Polars way to do a rolling rank? (For my own purposes, I have to write it as an Expr rather than using DataFrame.rolling.)
Using a bare None or any other non-numeric value directly won't work in this case. A pl.Series with a dtype of pl.Float64, however, will work.
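The TypeError from the question can be reproduced without Polars. The snippet below is a minimal sketch that assumes, based on the error message, that `s.rank(...)[-1]` evaluates to None when the last value of the window is null; the variable names are illustrative only.

```python
# Assumption: this is what s.rank(...)[-1] yields when the window ends in a null.
last_rank = None
# Assumption: a window containing a single non-null value.
non_null_count = 1

try:
    last_rank / non_null_count
except TypeError as exc:
    # Same message as the one wrapped inside the PanicException.
    message = str(exc)
    print(message)
```

Dividing None by an int is what raises here, so the custom function needs to handle that case before the division.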
You can wrap the needed None in a new pl.Series.
pl.Series(values=[None], dtype=pl.Float64)
Here it is applied in your ts_rank function.
import polars as pl

def ts_rank(expr: pl.Expr, window: int) -> pl.Expr:
    def rank(s):
        # Rank of the window's last value; None when that value is null.
        tmp = s.rank(method="average", descending=False)[-1]
        if tmp is None:
            # Return a typed null instead of dividing None by an int.
            return pl.Series(values=[None], dtype=pl.Float64)
        return tmp / s.is_not_null().sum()

    res = (
        expr.cast(pl.Float64)
        .rolling_map(rank, window_size=window, min_periods=window // 2)
        .over("a")
    )
    return res
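To sanity-check what the function returns, it can help to compute the same quantity for a single window by hand. The following is a plain-Python sketch, not the Polars implementation: `last_rank_ratio` is a hypothetical helper, and it assumes nulls are excluded from the ranking and that a null last value produces a null result.

```python
from typing import Optional, Sequence


def last_rank_ratio(window: Sequence[Optional[float]]) -> Optional[float]:
    """Average rank of the last value among the non-null values,
    divided by the number of non-null values."""
    last = window[-1]
    if last is None:
        # Mirrors the guard in rank(): no rank for a null last value.
        return None
    values = [v for v in window if v is not None]
    smaller = sum(v < last for v in values)
    ties = sum(v == last for v in values)  # includes the last value itself
    # Tied values share the mean of the rank positions they occupy.
    average_rank = smaller + (ties + 1) / 2
    return average_rank / len(values)


print(last_rank_ratio([4, 2, 3, 8]))           # 8 is the largest of four values
print(last_rank_ratio([None, None, None, 1]))  # only one non-null value
print(last_rank_ratio([1, None]))              # last value is null
```

Comparing these per-window values with the `rank` column is a quick way to confirm the expression behaves as intended for each group.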