multithreading, rust, concurrency, mutex

Concurrent Threading in Rust: What does "join on handle" do?


The following code from the Rust documentation demonstrates concurrent threading in Rust.

    use std::sync::{Arc, Mutex};
    use std::thread;
    
    fn main() {
        let counter = Arc::new(Mutex::new(0));
        let mut handles = vec![];
    
        for _ in 0..10 {
            let counter = Arc::clone(&counter);
            let handle = thread::spawn(move || {
                let mut num = counter.lock().unwrap();
    
                *num += 1;
            });
            handles.push(handle);
        }
    
        for handle in handles {
            handle.join().unwrap();
        }
    
        println!("Result: {}", *counter.lock().unwrap());
    }

I still couldn't grasp the idea of the for loop for the handles, i.e.

    for handle in handles {
        handle.join().unwrap();
    }

The documentation says "we call join on each handle to make sure all the threads finish." As an experiment, I commented out the loop over the handles and got an output of 8 instead of 10. When I changed the number of threads to 1000, I got 999 with the loop over the handles still commented out. What is happening here? How do 8 and 999 become the output?

EDIT: I found [this documentation][1], which touches on handles and the general concept of threading.

  [1]: https://doc.rust-lang.org/book/ch16-03-shared-state.html

Solution

  • I still couldn't grasp the idea of the for loop for the handles

    What is there to grasp? JoinHandle::join blocks the calling thread until the corresponding thread is done executing (that is, until the thread's function has returned), and that's most of what there is to it. Usefully, it also yields whatever the thread's function returned, wrapped in a Result that is an Err if the thread panicked.
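To make the "blocks, then yields the return value" behaviour concrete, here is a minimal sketch. The function name squares_via_join is hypothetical and chosen only for this illustration; it uses nothing beyond std::thread.

```rust
use std::thread;

/// Hypothetical helper: spawns `n` threads that each return the square of
/// their index, then joins them in spawn order and collects the results.
fn squares_via_join(n: u64) -> Vec<u64> {
    // Each handle is a JoinHandle<u64> because the closure returns a u64.
    let handles: Vec<thread::JoinHandle<u64>> = (0..n)
        .map(|i| thread::spawn(move || i * i))
        .collect();

    handles
        .into_iter()
        // join() blocks until that thread's closure has returned. It yields
        // Ok(value) on success, or Err if the thread panicked.
        .map(|handle| handle.join().expect("worker thread panicked"))
        .collect()
}

fn main() {
    // Results come back in spawn order because we join in spawn order,
    // regardless of the order in which the threads actually finished.
    println!("{:?}", squares_via_join(5)); // prints [0, 1, 4, 9, 16]
}
```

Because every handle is joined before the vector is returned, the caller never sees a partial result.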

    When I changed the loop to 1000, I got 999 when the handle loop is commented. What is happening here? How does 8 & 999 become the output?

    When you don't join the threads, you have a race between the spawned threads and the main thread (the main function). The value you get is however many threads have executed the increment in the time it took the main thread to

    1. create all the threads
    2. add each thread to the vector
    3. get a lock on the counter

    This will change depending on system load and OS scheduling details. Most of the main thread's delay comes from spawning more threads (compared to spawning a thread, acquiring a lock and incrementing a number is cheap), which is why most of the threads are done by the time you print the result. Any thread that has not incremented the counter by then is simply not counted, and the process exits when main returns without waiting for it. If you increase the per-thread work, or change the way the threads are spawned, you will see different races.
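One way to see the effect of per-thread work is the following sketch, a variation on the question's code rather than anything from the Rust book. The function observed_count and its parameters work and wait_for_threads are hypothetical names introduced for this experiment, and the sleep is only a stand-in for real work.

```rust
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

/// Hypothetical experiment: spawns `n` threads that each sleep for `work`
/// and then increment a shared counter. Returns the value the calling
/// thread reads afterwards, with or without joining first.
fn observed_count(n: usize, work: Duration, wait_for_threads: bool) -> usize {
    let counter = Arc::new(Mutex::new(0usize));
    let mut handles = Vec::with_capacity(n);

    for _ in 0..n {
        let counter = Arc::clone(&counter);
        handles.push(thread::spawn(move || {
            thread::sleep(work); // simulated per-thread work
            *counter.lock().unwrap() += 1;
        }));
    }

    if wait_for_threads {
        // Block until every thread has finished its increment.
        for handle in handles {
            handle.join().unwrap();
        }
    }
    // Without the join, the handles are simply dropped later and the
    // threads keep running detached; the read below races with them.

    let value = *counter.lock().unwrap();
    value
}

fn main() {
    let work = Duration::from_millis(20);
    // With join: always 10, because every increment has completed.
    println!("joined:     {}", observed_count(10, work, true));
    // Without join: anything from 0 to 10, depending on scheduling.
    println!("not joined: {}", observed_count(10, work, false));
}
```

With the sleep in place, the unjoined run will usually report far fewer than 10, since the main thread reaches the final lock before the workers wake up. The exact number is not guaranteed either way, which is the point of the race.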