Rayon looks great for parallelizing algorithms over collections, and Faster is great for vectorization (SIMD) on x86 for collections like Vec<f32>. I've tried to combine them, but their iterators don't seem to compose. Is there a way to use both libraries for algorithms that would benefit from both vectorization and parallelization? For example, this one from the Faster example:
    let lots_of_3s = (&[-123.456f32; 128][..]).iter()
        .map(|v| {
            9.0 * v.abs().sqrt().sqrt().recip().ceil().sqrt() - 4.0 - 2.0
        })
        .collect::<Vec<f32>>();
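You can just use Rayon's par_chunks and process each chunk with Faster. Each chunk is an ordinary slice, so Faster's SIMD iterators work on it as they would on the whole array.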
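    // Imports the snippet relies on: Rayon's parallel slice methods and Faster's SIMD iterators.
    use faster::*;
    use rayon::prelude::*;

    let lots_of_3s = (&[-123.456f32; 1000000][..])
        // Rayon hands out 128-element chunks to its worker threads.
        .par_chunks(128)
        .flat_map(|chunk| {
            chunk
                // Faster vectorizes the work inside each chunk.
                .simd_iter(f32s(0.0))
                .simd_map(|v| {
                    f32s(9.0) * v.abs().sqrt().rsqrt().ceil().sqrt() - f32s(4.0) - f32s(2.0)
                })
                .scalar_collect()
        })
        // The per-chunk Vecs are flattened back into one Vec, in order.
        .collect::<Vec<f32>>();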
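If it helps to see the shape of this without either crate, here is a minimal dependency-free sketch of the same "split into chunks, run a tight loop per chunk, stitch the results back in order" idea. It is not what Rayon or Faster do internally: std::thread::scope stands in for Rayon's thread pool, a plain scalar loop stands in for Faster's simd_map, and the names kernel and chunked_map are hypothetical helpers made up for this illustration.

```rust
use std::thread;

/// Scalar kernel from the question; a stand-in for the body of Faster's `simd_map`.
fn kernel(v: f32) -> f32 {
    9.0 * v.abs().sqrt().sqrt().recip().ceil().sqrt() - 4.0 - 2.0
}

/// Hypothetical helper: split `input` into one contiguous block per worker,
/// process each block on its own scoped thread, and write results in place
/// so the output keeps the input order.
fn chunked_map(input: &[f32], workers: usize) -> Vec<f32> {
    if input.is_empty() {
        return Vec::new();
    }
    let workers = workers.max(1);
    // Ceiling division so every element lands in some block.
    let block = (input.len() + workers - 1) / workers;
    let mut output = vec![0.0f32; input.len()];

    thread::scope(|s| {
        // `chunks` and `chunks_mut` split both slices at the same boundaries.
        for (src, dst) in input.chunks(block).zip(output.chunks_mut(block)) {
            s.spawn(move || {
                // Tight loop over a contiguous slice: this is where the
                // per-chunk SIMD work would go in the Rayon + Faster version.
                for (o, &v) in dst.iter_mut().zip(src) {
                    *o = kernel(v);
                }
            });
        }
    });

    output
}
```

Calling chunked_map(&vec![-123.456f32; 1_000_000], 4) should give a Vec of the same length as the input. The design point is the same as in the answer: the parallel layer only decides who gets which slice, and the inner loop over each slice is free to be vectorized however you like.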