
Pure Polars version of a safe ast.literal_eval


I have data like this:

import polars as pl
df = pl.DataFrame({'a': ["['b', 'c', 'd']"]})

I want to convert the string to a list, so I use:

df = df.with_columns(a=pl.col('a').str.json_decode())

but it raises:

ComputeError: error inferring JSON: InternalError(TapeError) at character 1 (''')

So I fall back to this function:

import ast

def safe_literal_eval(val):
    # Parse a Python literal; fall back to the raw string if parsing fails.
    try:
        return ast.literal_eval(val)
    except (ValueError, SyntaxError):
        return val

df = df.with_columns(
    a=pl.col('a').map_elements(safe_literal_eval, return_dtype=pl.List(pl.String))
)

This gives the expected output, but is there a pure Polars way to achieve the same result?


Solution

  • Polars does not yet offer a general equivalent of ast.literal_eval. The problem with json_decode is that the list representation uses single quotes, whereas JSON requires double quotes.

    In your example, you can work around this by replacing the single quotes with double quotes using pl.Expr.str.replace_all, as follows.

    df.with_columns(
        pl.col("a").str.replace_all("'", '"').str.json_decode()
    )
    
    shape: (1, 1)
    ┌─────────────────┐
    │ a               │
    │ ---             │
    │ list[str]       │
    ╞═════════════════╡
    │ ["b", "c", "d"] │
    └─────────────────┘
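
The quoting issue described above, and the limit of the quote-swap workaround, can be sketched with Python's standard json and ast modules standing in for Polars' JSON parser. This is only a minimal illustration, assuming both parsers apply the same JSON quoting rules; the `tricky` value is a hypothetical input that does not appear in the question.

```python
import ast
import json

raw = "['b', 'c', 'd']"  # Python-style list repr, as in the question

# JSON only accepts double-quoted strings, so the raw text is rejected.
try:
    json.loads(raw)
    raw_is_valid_json = True
except json.JSONDecodeError:
    raw_is_valid_json = False

# Swapping every single quote for a double quote yields valid JSON here.
swapped = json.loads(raw.replace("'", '"'))

# Hypothetical value with an embedded apostrophe: Python's repr switches
# to double quotes for that element, so the blanket swap corrupts it.
tricky = repr(["it's", "b"])  # the text ["it's", 'b']
try:
    json.loads(tricky.replace("'", '"'))
    tricky_survives_swap = True
except json.JSONDecodeError:
    tricky_survives_swap = False

# ast.literal_eval follows Python quoting rules and still parses it.
tricky_parsed = ast.literal_eval(tricky)
```

If a column may contain values with apostrophes or embedded double quotes, the blanket quote swap is not safe, and the map_elements fallback from the question may remain the more robust option.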