
Why do I need to call map twice when applying a custom function in Polars (Rust)?


let o = GetOutput::from_type(DataType::UInt32);
// this adds new column len, two is unchanged
let lf = lf.with_column(col("two").alias("len").apply(str_to_len, o));
fn str_to_len(str_val: Series) -> Result<Series> {
    let x = str_val
        .utf8()
        .unwrap()
        .into_iter()
        // your actual custom function would be in this map
        .map(|opt_name: Option<&str>| opt_name.map(|name: &str| name.len() as u32))
        .collect::<UInt32Chunked>();
    Ok(x.into_series())
}

This function compiles and works as expected.

fn str_to_len(str_val: Series) -> Result<Series> {
    let x = str_val
        .utf8()
        .unwrap()
        .into_iter()
        // your actual custom function would be in this map
        .map(|opt_name: Option<&str>| opt_name.unwrap().len() as u32) // <-- changed part
        .collect::<UInt32Chunked>();
    Ok(x.into_series())
}

But if I change the map part as shown above, the code no longer compiles.

rust-analyzer reports the following error:

error[E0277]: a value of type `ChunkedArray<UInt32Type>` cannot be built from an iterator over elements of type `u32`
    --> src/frag_feature_refiner.rs:70:10
     |
70   |         .collect::<UInt32Chunked>();
     |          ^^^^^^^ value of type `ChunkedArray<UInt32Type>` cannot be built from `std::iter::Iterator<Item=u32>`
     |
     = help: the trait `FromIterator<u32>` is not implemented for `ChunkedArray<UInt32Type>`
     = help: the following other types implement trait `FromIterator<A>`:
               <ChunkedArray<BinaryType> as FromIterator<Option<Ptr>>>
               <ChunkedArray<BinaryType> as FromIterator<Ptr>>
               <ChunkedArray<BooleanType> as FromIterator<Option<bool>>>
               <ChunkedArray<BooleanType> as FromIterator<bool>>
               <ChunkedArray<ListType> as FromIterator<Option<Box<(dyn polars::export::arrow2::array::Array + 'static)>>>>
               <ChunkedArray<ListType> as FromIterator<Option<polars::prelude::Series>>>
               <ChunkedArray<ListType> as FromIterator<Ptr>>
               <ChunkedArray<T> as FromIterator<(Vec<<T as PolarsNumericType>::Native>, Option<Bitmap>)>>
             and 3 others
note: the method call chain might not have had the expected associated types
    --> src/frag_feature_refiner.rs:69:10
     |
64   |     let x = str_val
     |             ------- this expression has type `Series`
...
67   |         .into_iter()
     |          ----------- `Iterator::Item` is `Option<&str>` here
68   |         // your actual custom function would be in this map
69   |         .map(|opt_name: Option<&str>| opt_name.unwrap().len() as u32) // <-- changed part
     |          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ `Iterator::Item` changed to `u32` here
note: required by a bound in `std::iter::Iterator::collect`
    --> /imbdx/user/eck/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/iter/traits/iterator.rs:1892:19
     |
1892 |     fn collect<B: FromIterator<Self::Item>>(self) -> B
     |                   ^^^^^^^^^^^^^^^^^^^^^^^^ required by this bound in `Iterator::collect`

In this case, why do I need to use map twice in the str_to_len function?


Solution

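  • As the error states, ChunkedArray<UInt32Type> does not implement FromIterator<u32>. FromIterator is, however, implemented for iterators whose elements are Option<u32> (see the documentation), since a value in the column may be null.

    The two map calls are different methods. The outer one is Iterator::map, which runs once per element. The inner one is Option::map, which applies the closure only to a Some value and passes None through, so the result is still an Option<u32>. unwrap instead extracts the value inside (and panics on None), so the iterator elements are plain u32 rather than Option<u32>, causing .collect to fail.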
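
The same behaviour can be reproduced without polars. The sketch below is only an illustration using the standard library: NullableU32Column is a hypothetical stand-in for UInt32Chunked (it is not a polars type), and, like the type in the question, it only implements FromIterator<Option<u32>>. The slice-based signatures are likewise assumptions made to keep the example self-contained.

```rust
use std::iter::FromIterator;

/// Hypothetical stand-in for polars' `UInt32Chunked`: a column whose
/// slots may be null. It can only be collected from `Option<u32>` items.
#[derive(Debug, PartialEq)]
struct NullableU32Column(Vec<Option<u32>>);

impl FromIterator<Option<u32>> for NullableU32Column {
    fn from_iter<I: IntoIterator<Item = Option<u32>>>(iter: I) -> Self {
        NullableU32Column(iter.into_iter().collect())
    }
}

/// Outer `map` is `Iterator::map` (runs once per row).
/// Inner `map` is `Option::map` (runs only on `Some`, keeps `None` as-is),
/// so every item stays an `Option<u32>` and `collect` is satisfied.
fn str_to_len(values: &[Option<&str>]) -> NullableU32Column {
    values
        .iter()
        .copied()
        .map(|opt_name: Option<&str>| opt_name.map(|name: &str| name.len() as u32))
        .collect()
}

/// If the input is known to contain no nulls, the plain `u32` has to be
/// wrapped back into `Some` before collecting into this column type.
fn str_to_len_no_nulls(values: &[&str]) -> NullableU32Column {
    values
        .iter()
        .map(|name| Some(name.len() as u32))
        .collect()
}
```

Calling str_to_len(&[Some("ab"), None]) keeps the null slot in place instead of panicking. Replacing the inner map with unwrap() in str_to_len would make the item type u32, and the collect call would be rejected with the same kind of E0277 error shown in the question.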