Does calling `into_inner()` on an atomic take into account all the relaxed writes?

Does into_inner() return all the relaxed writes in this example program? If so, which concept guarantees this?

extern crate crossbeam;

use std::sync::atomic::{AtomicUsize, Ordering};

fn main() {
    let thread_count = 10;
    let increments_per_thread = 100000;
    let i = AtomicUsize::new(0);

    crossbeam::scope(|scope| {
        for _ in 0..thread_count {
            scope.spawn(|| {
                for _ in 0..increments_per_thread {
                    i.fetch_add(1, Ordering::Relaxed);
                }
            });
        }
    });

    println!(
        "Result of {}*{} increments: {}",
        thread_count,
        increments_per_thread,
        i.into_inner()
    );
}

(https://play.rust-lang.org/?gist=96f49f8eb31a6788b970cf20ec94f800&version=stable)

I understand that crossbeam guarantees that all threads have finished by the time the scope ends. Since ownership goes back to the main thread, I also understand that there are no outstanding borrows. But the way I see it, there could still be pending writes, if not on the CPUs, then in the caches.

Which concept guarantees that all writes are finished and all caches are synced back to the main thread when into_inner() is called? Is it possible to lose writes?

Solution

Does into_inner() return all the relaxed writes in this example program? If so, which concept guarantees this?

It's not into_inner that guarantees it; it's join.

into_inner takes the atomic by value, so the caller must own it exclusively. What that guarantees is that either some synchronization has been performed since the final concurrent write (the thread was joined, the last Arc was dropped and the value unwrapped with try_unwrap, etc.), or the atomic was never sent to another thread in the first place. Either case is sufficient to make the read data-race-free.
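To illustrate the Arc case mentioned above, here is a minimal sketch using only the standard library. The function name count_with_arc is made up for this example. Ownership of the atomic can only be recovered with try_unwrap after every worker has been joined and has dropped its clone, and that join is what makes the relaxed increments visible.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

// Hypothetical helper for illustration; not part of the original question.
fn count_with_arc(thread_count: usize, increments_per_thread: usize) -> usize {
    let counter = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (0..thread_count)
        .map(|_| {
            // Each worker gets its own clone of the Arc.
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..increments_per_thread {
                    counter.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();

    // join is the synchronization point: once it returns, everything the
    // worker did happens-before what the joining thread does next.
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }

    // All clones were dropped when the workers finished, so this is the
    // last Arc and try_unwrap hands back the owned atomic.
    Arc::try_unwrap(counter)
        .expect("no other Arc clones should remain after join")
        .into_inner()
}

fn main() {
    println!("{}", count_with_arc(4, 10_000));
}
```

Without the join loop, try_unwrap could fail because a worker might still hold its clone, which is how the type system forces the synchronization to happen before into_inner can be called.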

The crossbeam documentation is explicit about using join at the end of a scope:

This [the thread being guaranteed to terminate] is ensured by having the parent thread join on the child thread before the scope exits.

Regarding losing writes:

Which concept guarantees that all writes are finished and all caches are synced back to the main thread when into_inner() is called? Is it possible to lose writes?

As stated in various places in the documentation, Rust inherits the C++ memory model for atomics. In C++11 and later, the completion of a thread synchronizes with the corresponding successful return from join. This means that by the time join completes, all actions performed by the joined thread must be visible to the thread that called join, so it is not possible to lose writes in this scenario.
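The same reasoning can be tried out without the crossbeam dependency. The sketch below swaps in std::thread::scope, which is an assumption on my part (it needs Rust 1.63 or later and is not what the question uses). Like crossbeam's scope, it joins every spawned thread before returning. The function name count_with_scope is made up for this example.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

// Hypothetical helper mirroring the question's program with std scoped threads.
fn count_with_scope(thread_count: usize, increments_per_thread: usize) -> usize {
    let i = AtomicUsize::new(0);

    thread::scope(|scope| {
        for _ in 0..thread_count {
            // The closure only borrows `i`; the scope guarantees the borrow
            // ends before `thread::scope` returns.
            scope.spawn(|| {
                for _ in 0..increments_per_thread {
                    i.fetch_add(1, Ordering::Relaxed);
                }
            });
        }
    }); // All spawned threads are joined here.

    // The joins above synchronize with the end of each thread, so every
    // relaxed increment is visible when the atomic is consumed.
    i.into_inner()
}

fn main() {
    let thread_count = 10;
    let increments_per_thread = 100_000;
    println!(
        "Result of {}*{} increments: {}",
        thread_count,
        increments_per_thread,
        count_with_scope(thread_count, increments_per_thread)
    );
}
```

Since fetch_add is an atomic read-modify-write, no increment can be lost even with Relaxed ordering; the join only ensures the final read sees them all.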

In terms of atomics, you can think of a join as an acquire read of an atomic that the thread performed a release store on just before it finished executing.
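That mental model can be written out by hand. The following is a rough sketch, not how join is actually implemented: the worker finishes with a release store to a flag, and the main thread spins on an acquire load of that flag before reading the counter. The names observe_after_flag and done are invented for this example, and std::thread::scope is used only so the threads can borrow local variables.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::thread;

// Hypothetical helper modelling "join" as a release store / acquire load pair.
fn observe_after_flag(increments: usize) -> usize {
    let counter = AtomicUsize::new(0);
    let done = AtomicBool::new(false);

    thread::scope(|s| {
        s.spawn(|| {
            for _ in 0..increments {
                counter.fetch_add(1, Ordering::Relaxed);
            }
            // Release store: publishes everything this thread did before it.
            done.store(true, Ordering::Release);
        });

        // Acquire load: once it reads `true`, it synchronizes with the
        // release store above, so all the relaxed increments are visible.
        while !done.load(Ordering::Acquire) {
            std::hint::spin_loop();
        }

        // A relaxed load is enough here because the acquire load already
        // established the happens-before relationship.
        counter.load(Ordering::Relaxed)
    })
}

fn main() {
    println!("{}", observe_after_flag(50_000));
}
```

If both flag operations were Relaxed instead, the final load would no longer be guaranteed to see all the increments, which is the situation the question was worried about.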