How to construct Array Series without going through List?


I can do this:

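use polars::prelude::*;

let mut col1: ListPrimitiveChunkedBuilder<Float64Type> = ListPrimitiveChunkedBuilder::new(
    "array",
    1, // capacity: rows to preallocate (a hint, not a limit)
    2, // values_capacity: inner values to preallocate (also a hint)
    DataType::Float64,
);

col1.append_slice(&[1.1, 2.2]);
col1.append_slice(&[2.1, 2.2]);
col1.append_slice(&[3.1, 2.2]);
let s = col1.finish().into_series();
// Cast the List(Float64) series to a fixed-width Array(Float64, 2).
let sa = s.cast(&DataType::Array(Box::new(DataType::Float64), 2)).unwrap();
// DataFrame::new returns a Result, so unwrap it before printing.
let df = DataFrame::new(vec![sa]).unwrap();
println!("{:?}", df);

which prints:

shape: (3, 1)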
┌───────────────┐
│ array         │
│ ---           │
│ array[f64, 2] │
╞═══════════════╡
│ [1.1, 2.2]    │
│ [2.1, 2.2]    │
│ [3.1, 2.2]    │
└───────────────┘

But it seems silly to build a List only to cast it to an Array afterwards. I found FixedSizeListNumericBuilder in polars-core/src/chunked_array_builder/fixed_size_list.rs, but I don't know how to use it.

A couple bonus questions:

  1. In ListPrimitiveChunkedBuilder, what are capacity and values_capacity? They don't seem to be binding (i.e. I can append more rows, and more values per row, than I specified there).

  2. Why do I have to wrap DataType::Float64 in Box::new()?


Solution

  • FixedSizeListNumericBuilder is not exported (it is declared pub(crate)), so you cannot use it from outside the crate. Most of its methods are also unsafe, which suggests it was never meant to be exposed.

    In ListPrimitiveChunkedBuilder, what are capacity and values_capacity? They don't seem to be binding (i.e. I can append more rows, and more values per row, than I specified there).

    Capacity in Rust usually refers to how much memory is allocated up front. It is not a hard limit, because the allocation grows on demand, but every growth step means reallocating and copying, which costs performance. If you can predict the final size, pass it as the capacity.
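The same behaviour can be seen with a plain Vec from the standard library. The sketch below is only an illustration, not Polars code: fill_with_hint is a hypothetical helper, and it assumes the builder's capacity arguments behave like Vec::with_capacity, i.e. as a preallocation hint.

```rust
/// Hypothetical helper: reserve room for `hint` elements, then push `n` of them.
/// Returns (capacity right after creation, final length, final capacity).
fn fill_with_hint(hint: usize, n: usize) -> (usize, usize, usize) {
    // with_capacity preallocates at least `hint` slots; it does not cap the length.
    let mut values: Vec<f64> = Vec::with_capacity(hint);
    let initial_capacity = values.capacity();

    // Pushing past the reserved capacity is allowed; the Vec reallocates as needed.
    for i in 0..n {
        values.push(i as f64 + 0.1);
    }

    (initial_capacity, values.len(), values.capacity())
}
```

Calling fill_with_hint(2, 6) yields a length of 6 even though only 2 slots were requested, and the final capacity is at least 6. A hint that is too small only costs extra reallocations.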

    Why do I have to wrap DataType::Float64 in Box::new()?

    Because DataType::Array recursively contains a DataType (the element type), and a recursive type must go through an indirection such as Box. Otherwise the compiler could not compute its size, since it would be infinite.
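A minimal sketch of why the Box is needed, using a made-up enum rather than the real Polars DataType. ElemType and nesting_depth are hypothetical names introduced for this illustration only.

```rust
// Hypothetical, simplified stand-in for a recursive data-type enum.
#[derive(Debug, Clone, PartialEq)]
enum ElemType {
    Float64,
    Int32,
    // Box keeps the inner type on the heap, so this variant stores only a
    // pointer plus the fixed width, and the enum has a known, finite size.
    // Writing `Array(ElemType, usize)` instead is rejected by the compiler
    // with error E0072 (recursive type has infinite size).
    Array(Box<ElemType>, usize),
}

/// Count how many Array layers wrap the innermost element type.
fn nesting_depth(t: &ElemType) -> usize {
    match t {
        ElemType::Array(inner, _) => 1 + nesting_depth(inner),
        _ => 0,
    }
}
```

For example, ElemType::Array(Box::new(ElemType::Float64), 2) mirrors the shape of DataType::Array(Box::new(DataType::Float64), 2) from the question and has a nesting depth of 1.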