Is there any way to create fields dynamically?. I know there are some ways. But it will be better to know best approach in polars. For example I want to add 12 shifted columns to existing dataframe.(lag1, lag2, lag3...lagN) How to achieve this?
Thanks.
You can just use the python language for that. Polars expressions are lazily evaluated, so you can create them anywhere, in a for loop, a function, list comprehension, you name it.
Below I give an example of dynamically created lag
columns, one by calling a function, assigning to a variable and then using that variable. And one with a list comprehension.
# some initial dataframe
df = pl.DataFrame({
"a": [1, 2, 3, 4, 5],
"b": [5, 4, 3, 2, 1]
})
# a function that returns a lazy evaluated expression
def lag(name: str, n: int) -> pl.Expr:
return pl.col(name).shift(n).name.suffix(f"_lag_{n}")
# a lazy evaluated expression assigned to a variable
lag_foo = lag("a", 1)
out = df.select([
lag_foo,
] + [lag("b", i) for i in range(5)] # create exprs with a list comprehension
)
print(out)
This outputs:
shape: (5, 6)
┌─────────┬─────────┬─────────┬─────────┬─────────┬─────────┐
│ a_lag_1 ┆ b_lag_0 ┆ b_lag_1 ┆ b_lag_2 ┆ b_lag_3 ┆ b_lag_4 │
│ --- ┆ --- ┆ --- ┆ --- ┆ --- ┆ --- │
│ i64 ┆ i64 ┆ i64 ┆ i64 ┆ i64 ┆ i64 │
╞═════════╪═════════╪═════════╪═════════╪═════════╪═════════╡
│ null ┆ 5 ┆ null ┆ null ┆ null ┆ null │
│ 1 ┆ 4 ┆ 5 ┆ null ┆ null ┆ null │
│ 2 ┆ 3 ┆ 4 ┆ 5 ┆ null ┆ null │
│ 3 ┆ 2 ┆ 3 ┆ 4 ┆ 5 ┆ null │
│ 4 ┆ 1 ┆ 2 ┆ 3 ┆ 4 ┆ 5 │
└─────────┴─────────┴─────────┴─────────┴─────────┴─────────┘