Creating dataframe with missing values


I am experienced in Polars' Python package and just starting to use the Rust crate.

In a function I want to return a DataFrame that almost certainly has columns with missing values. My current approach is to create vectors with sentinel values as a starting point for a DataFrame and then I hope to replace those values with nulls. But I'm not having much success.

I can create the vectors and DataFrame with something like this:

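// with_capacity only reserves space; the vector starts empty,
// so values must be pushed (indexing an empty Vec would panic).
let mut a_vec: Vec<i64> = Vec::with_capacity(10);

for i in 0..10 {
    if <condition> {
        a_vec.push(1);
    } else {
        // Sentinel standing in for a missing value
        a_vec.push(std::i64::MAX);
    }
}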

let mut df: DataFrame = df!("a" => a_vec).unwrap();

Now I want to replace std::i64::MAX with null.

In Python Polars I can use the replace method, but I haven't found a (good) way to do this in Rust.

If there is a better way to do this where I can avoid the sentinel values I'm all ears.


Solution

  • The proper way is to create a vector of Options:

    let mut a_vec: Vec<Option<i64>> = Vec::with_capacity(10);
    
    for i in 0..10 {
        if i != 5 {
            a_vec.push(Some(1));
        } else {
            a_vec.push(None);
        }
    }
    
    let mut df: DataFrame = df!("a" => a_vec).unwrap();
    

    You can also use chunked array builders, which should be more efficient:

    let mut a_vec: PrimitiveChunkedBuilder<Int64Type> = PrimitiveChunkedBuilder::new("a", 10);
    
    for i in 0..10 {
        if i != 5 {
            a_vec.append_value(1);
        } else {
            a_vec.append_null();
        }
    }
    
    let a_vec = a_vec.finish();
    
    let mut df: DataFrame = df!("a" => a_vec).unwrap();
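
If a sentinel-filled vector already exists (as in the question), one way to get to the Option-based form is to convert it with plain standard-library iterators before building the DataFrame. This is only a minimal sketch: `sentinels_to_options` is a hypothetical helper name chosen for illustration, not part of Polars, and it assumes the sentinel never occurs as a legitimate value.

```rust
/// Map every occurrence of `sentinel` to `None` and every other value to `Some(value)`.
/// Hypothetical helper for illustration; it uses only the standard library.
fn sentinels_to_options(values: Vec<i64>, sentinel: i64) -> Vec<Option<i64>> {
    values
        .into_iter()
        // A value equal to the sentinel is treated as missing
        .map(|v| if v == sentinel { None } else { Some(v) })
        .collect()
}
```

The resulting Vec<Option<i64>> can then be passed to df! in the same way as the vector of Options in the first snippet above.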