
How to create fields dynamically


Is there a way to create fields dynamically? I know there are several ways, but I would like to know the best approach in Polars. For example, I want to add 12 shifted columns to an existing dataframe (lag1, lag2, lag3, ..., lagN). How can I achieve this?

Thanks.


Solution

  • You can use plain Python for that. Polars expressions are lazily evaluated, so you can create them anywhere: in a for loop, a function, a list comprehension, you name it.

    Below is an example of dynamically created lag columns. One expression is created by calling a function and assigning the result to a variable, which is then passed to select; the others are created with a list comprehension.

    import polars as pl
    
    # some initial dataframe
    df = pl.DataFrame({
        "a": [1, 2, 3, 4, 5],
        "b": [5, 4, 3, 2, 1]
    })
    
    # a function that returns a lazily evaluated expression:
    # column `name` shifted by `n` rows, renamed to "<name>_lag_<n>"
    def lag(name: str, n: int) -> pl.Expr:
        return pl.col(name).shift(n).name.suffix(f"_lag_{n}")
    
    # a lazily evaluated expression assigned to a variable
    lag_foo = lag("a", 1)
    
    out = df.select([
        lag_foo,
    ] + [lag("b", i) for i in range(5)]  # exprs for lags 0 to 4, built with a list comprehension
    )
    
    print(out)
    

    This outputs:

    shape: (5, 6)
    ┌─────────┬─────────┬─────────┬─────────┬─────────┬─────────┐
    │ a_lag_1 ┆ b_lag_0 ┆ b_lag_1 ┆ b_lag_2 ┆ b_lag_3 ┆ b_lag_4 │
    │ ---     ┆ ---     ┆ ---     ┆ ---     ┆ ---     ┆ ---     │
    │ i64     ┆ i64     ┆ i64     ┆ i64     ┆ i64     ┆ i64     │
    ╞═════════╪═════════╪═════════╪═════════╪═════════╪═════════╡
    │ null    ┆ 5       ┆ null    ┆ null    ┆ null    ┆ null    │
    │ 1       ┆ 4       ┆ 5       ┆ null    ┆ null    ┆ null    │
    │ 2       ┆ 3       ┆ 4       ┆ 5       ┆ null    ┆ null    │
    │ 3       ┆ 2       ┆ 3       ┆ 4       ┆ 5       ┆ null    │
    │ 4       ┆ 1       ┆ 2       ┆ 3       ┆ 4       ┆ 5       │
    └─────────┴─────────┴─────────┴─────────┴─────────┴─────────┘