Explode multiple columns with different lengths

I have a dataframe like:

import polars as pl

data = {
    "a": [[1], [2], [3, 4], [5, 6, 7]],
    "b": [[], [8], [9, 10], [11, 12]],
}
df = pl.DataFrame(data)

│ a         ┆ b         │
│ ---       ┆ ---       │
│ list[i64] ┆ list[i64] │
│ [1]       ┆ []        │
│ [2]       ┆ [8]       │
│ [3, 4]    ┆ [9, 10]   │
│ [5, 6, 7] ┆ [11, 12]  │

The two lists in a row may not have the same length, and I want to "truncate" the explode to the shorter of the two lists, so a row where either list is empty produces no output:

│ a   ┆ b   │
│ --- ┆ --- │
│ i64 ┆ i64 │
│ 2   ┆ 8   │
│ 3   ┆ 9   │
│ 4   ┆ 10  │
│ 5   ┆ 11  │
│ 6   ┆ 12  │

I was thinking I might have to pad the shorter of the two lists with None so that both lengths match, and then call drop_nulls. Is there a more direct approach?
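For reference, the pad-then-drop idea can be sketched in plain Python, outside Polars. `itertools.zip_longest` pads the shorter list of each pair with None, and dropping every pair that contains None leaves the same rows as a plain `zip`, which stops at the shorter list. The variable names are illustrative only, and the sketch assumes the data contains no genuine None values.

```python
from itertools import zip_longest

a = [[1], [2], [3, 4], [5, 6, 7]]
b = [[], [8], [9, 10], [11, 12]]

# Pad the shorter list of each pair with None, then drop the padded pairs.
padded = [pair for la, lb in zip(a, b) for pair in zip_longest(la, lb)]
kept = [pair for pair in padded if None not in pair]

# zip() stops at the shorter list, so it truncates directly.
truncated = [pair for la, lb in zip(a, b) for pair in zip(la, lb)]

print(kept == truncated)
```

Both lists of pairs match the desired output above, which suggests that truncating up front avoids the padding step entirely.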


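  • Here's one approach:

    # Per-row length of the shorter of the two lists.
    min_length = pl.min_horizontal(pl.col('a', 'b').list.len())

    out = (
        df.filter(min_length != 0)  # drop rows where either list is empty
          .with_columns(pl.col('a', 'b').list.head(min_length))  # truncate both lists
          .explode('a', 'b')
    )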


    shape: (5, 2)
    │ a   ┆ b   │
    │ --- ┆ --- │
    │ i64 ┆ i64 │
    │ 2   ┆ 8   │
    │ 3   ┆ 9   │
    │ 4   ┆ 10  │
    │ 5   ┆ 11  │
    │ 6   ┆ 12  │
