Tags: python, python-polars

How do I filter rows in which a list contains null in a polars DataFrame?


I want to read a csv file in which some columns are strings of this type: "[1.0, 0.0, -2.3]".

So I want to parse these strings as lists of floats; to do so I use this code:

import polars as pl

df = pl.scan_csv(
    "foo.csv",
).with_columns(
    pl.col("^list.*$").str.strip("[ ]").str.split(",").cast(pl.List(pl.Float64)),
).collect()

Some of the strings are empty lists: "[]"; and when I parse them I get a list like this one: [null].

I would like to remove all the rows in which any list contains a null element.

This is the code I tried:

df.filter(
    pl.all_horizontal(
        ~pl.col("^list.*$").list.contains(None)
    )
)

But it gives me this error:

InvalidOperationError: `is_in` operation not supported for dtype `list[f64]`

Here is a little example file so you can test the code yourself:

list1,list2,list3
"[]","[0.0, 1.2]","[3.4]"
"[102, 506]","[12, 5.2]","[2.3]"
"[15]","[0.5, 8.2]","[]"

In this example I would like only the second row to be kept in the DataFrame.


Solution

  • I think you want to replace (or, rather, replace_all) the spaces, as your current code loses the 506

    df = pl.read_csv(b"""
    list1,list2,list3
    "[]","[0.0, 1.2]","[3.4]"
    "[102, 506]","[12, 5.2]","[2.3]"
    "[15]","[0.5, 8.2]","[]"
    """)
    

    If you filter first you can avoid the dtype issue:

    (df.filter(pl.all_horizontal(pl.all() != "[]"))
       .with_columns(
          pl.all().str.replace_all(" ", "")
            .str.strip_chars("[]")
            .str.split(",")
            .cast(pl.List(pl.Float64))
       )
    )
    
    shape: (1, 3)
    ┌────────────────┬─────────────┬───────────┐
    │ list1          ┆ list2       ┆ list3     │
    │ ---            ┆ ---         ┆ ---       │
    │ list[f64]      ┆ list[f64]   ┆ list[f64] │
    ╞════════════════╪═════════════╪═══════════╡
    │ [102.0, 506.0] ┆ [12.0, 5.2] ┆ [2.3]     │
    └────────────────┴─────────────┴───────────┘