
Rust Polars: How to get the row count of a DataFrame?


I want to filter a Polars DataFrame and then get the number of rows.

What I'm doing now seems to work but feels so wrong:

    let item_count = item_df
        .lazy()
        .filter(not(col("status").is_in(lit(filter))))
        .collect()?
        .shape().0;

In a subsequent DataFrame operation I need to use this value as the divisor:

    .with_column(
        col("count")
            .div(lit(item_count as f64))
            .mul(lit(100.0))
            .alias("percentage"),
    );

This is for a tiny dataset (tens of rows), so I'm not worried about performance, but I'd like to learn what the best way would be.


Solution

  • While there doesn't seem to be a predefined method on LazyFrame, you can use Polars expressions:

    use polars::prelude::*;
    
    let df = df!["a" => [1, 2], "b" => [3, 4]].unwrap();
    dbg!(df.lazy().select([len()]).collect().unwrap());
    

    And to get the numeric value:

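    // `len()` produces the index dtype: u32 by default, u64 with the `bigidx` feature.
    let count: u32 = df.lazy().select([len().alias("count")])
        .collect().unwrap()
        .column("count").unwrap()
        .u32().unwrap()
        .get(0).unwrap();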