
How to filter a polars dataframe by date?


df.filter(pl.col("MyDate") >= "2020-01-01")

does not work like it does in pandas.

I found a workaround

df.filter(pl.col("MyDate") >= pl.datetime(2020,1,1))

but this does not solve the problem when the date is held in a string variable.


Solution

  • You can convert the string to a date type, e.g. with .str.to_date().

    Building on the example above:

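    import polars as pl
    from datetime import datetime
    
    df = pl.DataFrame({
        "dates": [datetime(2021, 1, 1), datetime(2021, 1, 2), datetime(2021, 1, 3)],
        "vals": range(3)
    })
    
    # The cutoff arrives as a string; this value matches the output shown below.
    my_date_str = "2021-01-02"
    
    # Wrap the string in a literal and parse it to a Date before comparing.
    df.filter(pl.col('dates') >= pl.lit(my_date_str).str.to_date())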
    
    shape: (2, 2)
    ┌─────────────────────┬──────┐
    │ dates               ┆ vals │
    │ ---                 ┆ ---  │
    │ datetime[μs]        ┆ i64  │
    ╞═════════════════════╪══════╡
    │ 2021-01-02 00:00:00 ┆ 1    │
    │ 2021-01-03 00:00:00 ┆ 2    │
    └─────────────────────┴──────┘