Say I want to do the following in Polars:
df.with_columns(
a_1 = pl.col('a').shift(1),
a_2 = pl.col('a').shift(2),
b_1 = pl.col('b').shift(1),
b_2 = pl.col('b').shift(2),
)
Say, starting from
import polars as pl
df = pl.DataFrame({'a': [1,3,2,4], 'b': [5,1,2,1]})
So, desired output:
shape: (4, 6)
┌─────┬─────┬──────┬──────┬──────┬──────┐
│ a ┆ b ┆ a_1 ┆ a_2 ┆ b_1 ┆ b_2 │
│ --- ┆ --- ┆ --- ┆ --- ┆ --- ┆ --- │
│ i64 ┆ i64 ┆ i64 ┆ i64 ┆ i64 ┆ i64 │
╞═════╪═════╪══════╪══════╪══════╪══════╡
│ 1 ┆ 5 ┆ null ┆ null ┆ null ┆ null │
│ 3 ┆ 1 ┆ 1 ┆ null ┆ 5 ┆ null │
│ 2 ┆ 2 ┆ 3 ┆ 1 ┆ 1 ┆ 5 │
│ 4 ┆ 1 ┆ 2 ┆ 3 ┆ 2 ┆ 1 │
└─────┴─────┴──────┴──────┴──────┴──────┘
How can I do this with ibis?
I need to do this starting from the Polars dataframe df
, so I need to connect to that using Ibis. Looking at the docs doesn't give much away in terms of what I'm actually supposed to do
Is it correct to do:
t = ibis.memtable(df)
t = t.mutate(
ibis._['a'].lead(1),
ibis._['a'].lead(2),
ibis._['b'].lead(1),
ibis._['b'].lead(2),
)
t.to_polars()
you can use lag()
window function:
t.mutate(
a_1 = t.a.lag(1),
a_2 = t.a.lag(2),
b_1 = t.b.lag(1),
b_2 = t.b.lag(2)
).to_polars()
shape: (4, 6)
┌─────┬─────┬──────┬──────┬──────┬──────┐
│ a ┆ b ┆ a_1 ┆ a_2 ┆ b_1 ┆ b_2 │
│ --- ┆ --- ┆ --- ┆ --- ┆ --- ┆ --- │
│ i64 ┆ i64 ┆ i64 ┆ i64 ┆ i64 ┆ i64 │
╞═════╪═════╪══════╪══════╪══════╪══════╡
│ 1 ┆ 5 ┆ null ┆ null ┆ null ┆ null │
│ 3 ┆ 1 ┆ 1 ┆ null ┆ 5 ┆ null │
│ 2 ┆ 2 ┆ 3 ┆ 1 ┆ 1 ┆ 5 │
│ 4 ┆ 1 ┆ 2 ┆ 3 ┆ 2 ┆ 1 │
└─────┴─────┴──────┴──────┴──────┴──────┘