Parallel sampling from a random distribution in Rust?


I need to generate huge random vectors of f32s multiple times in my program, so I am looking for ways to make this parallel and efficient. I have made some progress using Rayon's into_par_iter, but I haven't found a way to avoid initializing a new rng variable inside the mapping closure.

Here is what I have currently:

    let r_dist = Uniform::new(0., 10.);

    let rand_vec: Vec<f32> = (1..biiiig_u64)
        .into_par_iter()
        .map(|_| {
            let mut rng = rand::thread_rng();
            rng.sample(r_dist)
        })
        .collect();

Of course this makes full use of all CPU cores, but I suspect that creating a new mut rng inside the mapping closure is inefficient (I am new to Rust, so I might be wrong). Is it possible to initialize an rng outside the iterator and use it without unsafe code? Thanks.


Solution

  • thread_rng is designed specifically for efficient use from multiple threads. From the docs:

    Retrieve the lazily-initialized thread-local random number generator, seeded by the system.

    So the generator is created once per thread and stored in a thread-local variable; each later call to thread_rng() only fetches a handle to it. It should already be quite fast.

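    However, Rayon has a method for exactly this use case: map_init. Its init function is called only as needed, once per Rayon job rather than once per item, and the value it returns is passed by mutable reference to the mapping closure.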

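        use rand::{distributions::Uniform, Rng}; // rand 0.8 module paths
        use rayon::prelude::*;

        let r_dist = Uniform::new(0., 10.);

        // `init` (rand::thread_rng) runs once per Rayon job, not once per item;
        // the closure then borrows that rng mutably for each element.
        let rand_vec: Vec<f32> = (1..biiiig_u64)
            .into_par_iter()
            .map_init(rand::thread_rng, |rng, _| rng.sample(r_dist))
            .collect();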
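
For intuition, here is a rough, dependency-free sketch of the pattern map_init gives you: set up a generator once per unit of work, then reuse it for every element in that unit. It uses std::thread::scope with fixed-size chunks instead of Rayon's work stealing, so it is a simplification, not how Rayon is implemented. SplitMix64, fill_uniform, n_threads and seed are names made up for this illustration and are not part of rand or Rayon. The hand-rolled generator is only a stand-in so the snippet compiles without crates; its per-chunk seeding has not been statistically vetted, so prefer rand's generators in real code.

```rust
use std::thread;

/// Tiny SplitMix64 generator. Illustration only: a stand-in for a real RNG
/// such as rand's ThreadRng, so that this sketch needs no external crates.
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Roughly uniform f32 between `low` and `high`, built from the top 24 bits.
    fn uniform_f32(&mut self, low: f32, high: f32) -> f32 {
        let unit = (self.next_u64() >> 40) as f32 / (1u64 << 24) as f32; // in [0, 1)
        low + unit * (high - low)
    }
}

/// Fill `out` in parallel. Each worker creates its generator once and reuses
/// it for every element of its chunk (the role `init` plays in map_init).
fn fill_uniform(out: &mut [f32], low: f32, high: f32, n_threads: usize, seed: u64) {
    let n = n_threads.max(1);
    // Ceiling division; at least 1 because chunks_mut panics on a zero size.
    let chunk_len = ((out.len() + n - 1) / n).max(1);
    thread::scope(|s| {
        for (i, chunk) in out.chunks_mut(chunk_len).enumerate() {
            s.spawn(move || {
                // One-time setup per chunk; the seed mixing is an ad hoc assumption.
                let mut rng = SplitMix64(seed ^ (i as u64).wrapping_mul(0xD1B5_4A32_D192_ED03));
                for x in chunk.iter_mut() {
                    *x = rng.uniform_f32(low, high);
                }
            });
        }
    });
}
```

Calling fill_uniform(&mut buf, 0.0, 10.0, 8, 42) on a preallocated Vec<f32> fills it using up to 8 threads. Compared with this sketch, the map_init version above is shorter, balances load automatically, and draws from a properly seeded generator, so it remains the better choice in practice.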