
How do I flatten the elements of a column of type list of lists so that it is a column with elements of type list?


Consider the following example:

import polars as pl

pl.DataFrame(pl.Series("x", ["1, 0", "2,3", "5 4"])).with_columns(
    pl.col("x").str.split(",").list.eval(pl.element().str.split(" "))
)
shape: (3, 1)
┌────────────────────┐
│ x                  │
│ ---                │
│ list[list[str]]    │
╞════════════════════╡
│ [["1"], ["", "0"]] │
│ [["2"], ["3"]]     │
│ [["5", "4"]]       │
└────────────────────┘

I want to flatten each element of the column so that it is a plain list (list[str]) rather than a nested list (list[list[str]]). How do I do that?


Solution

  • You can use Expr.explode(), Expr.list.explode(), or Expr.flatten() to return one row for each list element. Called directly on the column, these explode the series itself into more rows. Called inside Expr.list.eval(), they operate on each row's nested lists instead, so every row keeps a single, flattened list.

    import polars as pl
    
    df = pl.DataFrame(pl.Series("x", ["1, 0", "2,3", "5 4"]))
    print(df.with_columns(
        pl.col("x")
        .str.split(",")
        .list.eval(pl.element().str.split(" "))
        .list.eval(pl.element().flatten())
    ))
    # You can also combine it with your existing .list.eval()
    print(df.with_columns(
        pl.col("x")
        .str.split(",")
        .list.eval(pl.element().str.split(" ").flatten())
    ))
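
To make the per-row behaviour concrete, here is a minimal plain-Python sketch of what the split, split, flatten chain does to each element. It does not use Polars at all; the names `rows`, `nested`, and `flat` are illustrative only, and it assumes the same three input strings as the question.

```python
from itertools import chain

rows = ["1, 0", "2,3", "5 4"]

# Split on "," and then split every piece on " ",
# which yields one list of lists per row.
nested = [[part.split(" ") for part in row.split(",")] for row in rows]

# Flatten one level inside each row, keeping one list per row.
flat = [list(chain.from_iterable(row)) for row in nested]

for before, after in zip(nested, flat):
    print(before, "->", after)
```

Note that the empty string produced by splitting " 0" on a space survives the flatten, just as it appears in the nested output shown in the question.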