
Calling polars apply results in an error message


This:

import numpy
import polars

df = polars.DataFrame(dict(
  j=numpy.random.randint(10, 99, 10),
  k=numpy.random.randint(10, 99, 10),
))

def f(cell):
  # Simulate logic that cannot be done in polars alone
  # E.g. call external REST service that returns "success" => True
  return cell > 50

print(df.select(polars.all().apply(f)))

Causes this:

thread '<unnamed>' panicked at 'called `Result::unwrap()` on an `Err` value: ComputeError(ErrString("wildcard has no root column name"))', /home/runner/work/polars/polars/crates/polars-plan/src/utils.rs:207:47
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

This happens with polars 0.18.11, though the script still finishes and produces the expected result. What is the panic message about, and how can I avoid it?


Solution

  • The error message is caused by a bug in Polars, and it is triggered by the wildcard pl.all(). A more minimal example:

    import polars as pl
    
    df = pl.DataFrame({"a": [1, 2, 3]})
    
    result = df.select(pl.all().apply(lambda x: x > 50))
    # thread '<unnamed>' panicked ...
    

    Replacing pl.all() with pl.col("a") avoids the error message.

    I have opened a bug report in the Polars repo to solve this issue. You can track it here.

    UPDATE: The issue has been resolved. Once a new Python release is issued (0.18.12), you can upgrade to the new version and the error message should no longer show.