
Polars: Select element of list using column value as index


Code to generate the toy dataset:

import itertools
import numpy as np
import polars as pl

first = 30
second = 50
third = 40
data = {
    "a": np.concatenate(
        (np.repeat(1, first), np.repeat(2, second), np.repeat(3, third))
    ),
    "b": np.concatenate(
        (
            sorted(np.random.randint(1, first, size=first)),
            sorted(np.random.randint(1, second, size=second)),
            sorted(np.random.randint(1, third, size=third)),
        )
    ),
}

d = [
    np.tile(np.random.randint(1, first * 2, size=first), (first, 1)).tolist(),
    np.tile(np.random.randint(1, second * 2, size=second), (second, 1)).tolist(),
    np.tile(np.random.randint(1, third * 2, size=third), (third, 1)).tolist(),
]
data["d"] = list(itertools.chain.from_iterable(d))

df = pl.DataFrame(data)
pl_df = df.with_columns([pl.col("a").cum_count().over("a", "b").alias("c")])
pl_df.select(['a', 'b', 'c', "d"]).head()


I'm using polars 0.20.3. For each row, I'd like to use the value in column "b" as an index into the list stored in column "d" and take the element at that position, meaning that:

  • for the first row, I would take the number 23 from column "d"
  • for the second row, I would take the number 17 from column "d"
  • ...

How can I achieve that without iterating over the rows of the dataframe?

Thanks in advance


Solution

  • For this, pl.Expr.list.get can be used as follows.

    (
        df
        .with_columns(
            pl.col("d").list.get(pl.col("b")-1).alias("res")
        )
    )
    

    Note that I pass pl.col("b") - 1 as the index argument of pl.Expr.list.get because list.get uses 0-based indices, whereas the question treats column "b" as a 1-based position (the first list element has index 1 instead of 0).

    shape: (120, 4)
    ┌─────┬─────┬────────────────────┬─────┐
    │ a   ┆ b   ┆ d                  ┆ res │
    │ --- ┆ --- ┆ ---                ┆ --- │
    │ i64 ┆ i64 ┆ list[i64]          ┆ i64 │
    ╞═════╪═════╪════════════════════╪═════╡
    │ 1   ┆ 1   ┆ [31, 25, 45, … 59] ┆ 31  │
    │ 1   ┆ 3   ┆ [31, 25, 45, … 59] ┆ 45  │
    │ 1   ┆ 4   ┆ [31, 25, 45, … 59] ┆ 25  │
    │ 1   ┆ 8   ┆ [31, 25, 45, … 59] ┆ 24  │
    │ 1   ┆ 8   ┆ [31, 25, 45, … 59] ┆ 24  │
    │ …   ┆ …   ┆ …                  ┆ …   │
    └─────┴─────┴────────────────────┴─────┘