Tags: python, python-polars

Raise exception in .map_elements()


Update: This was fixed by pull/20417 in Polars 1.18.0


I'm using .map_elements() to apply a complex Python function to every element of a Polars Series. Here is a toy example:

import polars as pl

df = pl.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

def sum_cols(row):
    return row["A"] + row["B"]

df.with_columns(
    pl.struct(pl.all())
    .map_elements(sum_cols, return_dtype=pl.Int32).alias("summed")
)

shape: (3, 3)
┌─────┬─────┬────────┐
│ A   ┆ B   ┆ summed │
│ --- ┆ --- ┆ ---    │
│ i64 ┆ i64 ┆ i32    │
╞═════╪═════╪════════╡
│ 1   ┆ 4   ┆ 5      │
│ 2   ┆ 5   ┆ 7      │
│ 3   ┆ 6   ┆ 9      │
└─────┴─────┴────────┘

However, when my function raises an exception, Polars silently fills the output column with nulls instead of failing:

def sum_cols(row):
    raise Exception
    return row["A"] + row["B"]

df.with_columns(
    pl.struct(pl.all())
    .map_elements(sum_cols, return_dtype=pl.Int32).alias("summed")
)

shape: (3, 3)
┌─────┬─────┬────────┐
│ A   ┆ B   ┆ summed │
│ --- ┆ --- ┆ ---    │
│ i64 ┆ i64 ┆ i32    │
╞═════╪═════╪════════╡
│ 1   ┆ 4   ┆ null   │
│ 2   ┆ 5   ┆ null   │
│ 3   ┆ 6   ┆ null   │
└─────┴─────┴────────┘

How can I make the Polars command fail when my function raises an exception?


Solution

  • I'm pretty sure this is a bug in Polars.

    As a workaround, you could use .map_batches() to pass the whole "column" instead:

    import polars as pl
    
    df = pl.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
    
    def sum_cols(col):
        # col is the whole struct Series, not a single row
        raise Exception
        return pl.Series([row["A"] + row["B"] for row in col])
    
    df.with_columns(
        pl.struct(pl.all()).map_batches(sum_cols).alias("summed")
    )
    

    This propagates the exception as one would expect:

    # ComputeError: Exception: