python, python-polars, feature-engineering

Cast column of type list[] to str in polars


Currently, calling Polars' cast() method on a column of type list[] is not supported. It raises:

ComputeError: Cannot cast list type

Before I fall back on my usual workarounds (using rows(), converting to pandas, or working with apply()): is there a trick or best practice for converting a Polars list[] column to strings?

Here is a quick snippet to reproduce the error:

import polars as pl

df = pl.from_dict({'foo': [[1,2,3]], 'bar': 'Hello World'})
print(df)
'''
shape: (1, 2)
┌───────────┬─────────────┐
│ foo       ┆ bar         │
│ ---       ┆ ---         │
│ list[i64] ┆ str         │
╞═══════════╪═════════════╡
│ [1, 2, 3] ┆ Hello World │
└───────────┴─────────────┘
'''
df['foo'].cast(str)  # raises ComputeError: Cannot cast list type
# this other workaround won't work either
df.select([pl.col('foo').str])

Here is what I expect to see:

'''
shape: (1, 2)
┌───────────┬─────────────┐
│ foo       ┆ bar         │
│ ---       ┆ ---         │
│ str       ┆ str         │
╞═══════════╪═════════════╡
│"[1, 2, 3]"┆ Hello World │
└───────────┴─────────────┘
'''

Solution

  • You can cast to the datatype pl.List(pl.String), which converts each element of the list to a string:

    df.with_columns(pl.col("foo").cast(pl.List(pl.String)))
    
    shape: (1, 2)
    ┌─────────────────┬─────────────┐
    │ foo             ┆ bar         │
    │ ---             ┆ ---         │
    │ list[str]       ┆ str         │
    ╞═════════════════╪═════════════╡
    │ ["1", "2", "3"] ┆ Hello World │
    └─────────────────┴─────────────┘
    

    To create an actual string, you could join the elements and add the brackets yourself:

    df.with_columns("[" + 
       pl.col("foo").cast(pl.List(pl.String)).list.join(", ") 
       + "]"
    )
    

    There's also pl.format():

    df.with_columns(
       pl.format("[{}]",
          pl.col("foo").cast(pl.List(pl.String)).list.join(", ")))
    
    shape: (1, 3)
    ┌───────────┬─────────────┬───────────┐
    │ foo       ┆ bar         ┆ literal   │
    │ ---       ┆ ---         ┆ ---       │
    │ list[i64] ┆ str         ┆ str       │
    ╞═══════════╪═══════════╪═══════════╡
    │ [1, 2, 3] ┆ Hello World ┆ [1, 2, 3] │
    └───────────┴─────────────┴───────────┘