Tags: date, datetime, rust, time-series, rust-polars

Filter a polars dataframe by date in rust


I'm trying to filter a dataframe by date, but filtering it with expressions like this would be really cumbersome for a date like "2019-11-01 10:15:00".

My goal is to do something like the Python version:

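use polars::export::chrono::NaiveDateTime;
use polars::prelude::*;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // `path` is the location of the CSV file (not shown here)
    let df = LazyCsvReader::new(path)
        .with_parse_dates(true)
        .has_header(true)
        .finish()?
        .collect()?;

    // The cutoff date to filter by
    let dt = NaiveDateTime::parse_from_str("2019-11-01 10:15:00", "%Y-%m-%d %H:%M:%S")?;

    // This will not compile! The eager `DataFrame::filter` takes a boolean
    // mask (`&BooleanChunked`), and `col("time") < dt` is not one.
    let filtered = df.filter(col("time") < dt);

    Ok(())
}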

However, I'm having a really hard time filtering the dataframe in place or even just creating a boolean mask.


Solution

  • After more time than I dare to admit, I finally solved it using the eager API. There is probably a better solution in the lazy API, but this works for now!

    use polars::export::chrono::NaiveDateTime;
    use polars::prelude::*;

    fn main() -> Result<(), Box<dyn std::error::Error>> {
        // `path` is the location of the CSV file (not shown here)
        let df = LazyCsvReader::new(path)
            .with_parse_dates(true)
            .has_header(true)
            .finish()?
            .collect()?;

        // Set date to filter by
        let dt = NaiveDateTime::parse_from_str("2019-11-01 10:15:00", "%Y-%m-%d %H:%M:%S")?;

        // Create boolean mask: true where "time" is before `dt`.
        // Note: `x.unwrap()` panics if the column contains a null.
        let mask: BooleanChunked = df["time"]
            .datetime()?
            .as_datetime_iter()
            .map(|x| x.unwrap() < dt)
            .collect();

        // New filtered df
        let filtered_df = df.filter(&mask)?;

        Ok(())
    }
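The mask above calls `x.unwrap()`, which panics as soon as the "time" column contains a null. The following is a minimal plain-Rust sketch of the same mask-then-filter idea with nulls propagated instead of unwrapped. It is not the polars API: the function names `mask_before` and `filter_rows` are hypothetical, and timestamps are modelled as `Option<i64>` microseconds since the Unix epoch purely for illustration. Dropping rows whose mask entry is null is an assumption of this sketch.

```rust
/// Build a null-aware boolean mask: `Some(true)` where the timestamp is
/// strictly before `cutoff`, `Some(false)` otherwise, and `None` for nulls.
/// Timestamps are assumed to be microseconds since the Unix epoch.
fn mask_before(times: &[Option<i64>], cutoff: i64) -> Vec<Option<bool>> {
    times.iter().map(|t| t.map(|v| v < cutoff)).collect()
}

/// Keep only the rows whose mask entry is `Some(true)`.
/// Rows with a `None` (null) or `Some(false)` mask entry are dropped.
fn filter_rows<T: Clone>(rows: &[T], mask: &[Option<bool>]) -> Vec<T> {
    rows.iter()
        .zip(mask)
        .filter(|(_, keep)| **keep == Some(true))
        .map(|(row, _)| row.clone())
        .collect()
}
```

In the polars snippet, the analogous change would be mapping with `x.map(|t| t < dt)` rather than `x.unwrap() < dt`, so that a null timestamp yields a null mask entry instead of a panic.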
    
    

    To get a date value from the "time" column and parse it as a NaiveDateTime:

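    fn main() -> Result<(), Box<dyn std::error::Error>> {
        // `df` is the DataFrame loaded as in the snippet above

        // Let's take the last date from a series of datetime[µs]
        let date: Vec<Option<NaiveDateTime>> = df["time"]
            .tail(Some(1))
            .datetime()?
            .as_datetime_iter()
            .collect();

        // Create a new NaiveDateTime that can be used as a filter/condition in a map function.
        // Note: date[0] is already an Option<NaiveDateTime>; the round trip through a
        // string only parses if the value has no fractional seconds.
        let dt2 = NaiveDateTime::parse_from_str(&date[0].unwrap().to_string(), "%Y-%m-%d %H:%M:%S")?;

        Ok(())
    }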
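The snippet above indexes `date[0]` and unwraps it, so it panics if the column is empty or if its last entry is null. A small plain-Rust sketch of taking the last value without either panic is shown below. It is not polars code: `last_timestamp` is a hypothetical helper, and the column is modelled as a slice of `Option<i64>` timestamps for illustration only.

```rust
/// Return the last entry of a nullable timestamp column, or `None` when the
/// column is empty or its last entry is null.
fn last_timestamp(times: &[Option<i64>]) -> Option<i64> {
    // `last()` gives Option<&Option<i64>>; copy it out and merge the two layers.
    times.last().copied().flatten()
}
```

The caller can then decide what to do with a missing value, for example by matching on the returned `Option` instead of calling `unwrap()`.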