rust, concurrency

How can I share data between threads without locking all of it?


Consider the following scenario.

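use std::collections::HashMap;
use crossbeam::thread; // assumed: `spawn(|_| ...)` and `.unwrap()` match crossbeam's scoped threads

let mut map = HashMap::new();
map.insert(2, 5);
thread::scope(|s| {
    s.spawn(|_| {
        map.insert(1, 5); // needs a mutable borrow of `map`
    });
    s.spawn(|_| {
        let d = map.get(&2).unwrap(); // needs a shared borrow of `map`
    });
}).unwrap();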

This code does not compile because the first spawned thread borrows the variable map mutably while the second thread borrows it again. The classical solution is to wrap map in Arc<Mutex<...>>. But in the above code, we don't need to lock the whole hashmap, because although the two threads access the same hashmap concurrently, they access completely different regions of it.
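
For reference, the classical approach mentioned above might look roughly like the following. This is only a sketch using the standard library: it swaps the scoped threads from the snippet for plain std::thread::spawn, and the function name run_with_mutex is made up for illustration.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

/// Runs the writer and the reader from the question against a mutex-protected
/// map and returns (value read for key 2, final number of entries).
fn run_with_mutex() -> (i32, usize) {
    let mut initial = HashMap::new();
    initial.insert(2, 5);
    let map = Arc::new(Mutex::new(initial));

    let writer = {
        let map = Arc::clone(&map);
        thread::spawn(move || {
            // The lock is held only for the duration of this statement.
            map.lock().unwrap().insert(1, 5);
        })
    };
    let reader = {
        let map = Arc::clone(&map);
        thread::spawn(move || {
            // Copy the value out so that no reference outlives the lock guard.
            let d = *map.lock().unwrap().get(&2).unwrap();
            d
        })
    };

    writer.join().unwrap();
    let d = reader.join().unwrap();
    let len = map.lock().unwrap().len();
    (d, len)
}

fn main() {
    let (d, len) = run_with_mutex();
    println!("read {d}, map now holds {len} entries");
}
```

Both threads take the same lock here, which is exactly the whole-map locking the question is trying to avoid.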

So I want to share map between threads without using a lock, but how can I achieve that? I'm also open to using unsafe Rust...


Solution

  • in the above code, we don't need to lock the whole hashmap

    Actually, we do.

    Every insert into the HashMap may trigger a reallocation if the map is at capacity at that point. Now, imagine the following sequence of events:

    • The second thread calls get and retrieves a reference to the value (at runtime, it is just an address).
    • The first thread calls insert.
    • The map gets reallocated, and the old chunk of memory is now invalid.
    • The second thread dereferences the previously retrieved reference. Boom, we get undefined behavior (UB)!

    So, if you need to insert something in the map concurrently, you have to synchronize that somehow.
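
    The reallocation can be observed in safe, single-threaded code by watching the capacity change. This is only an illustrative sketch: the function name capacity_growth and the inserted keys are arbitrary, and the exact capacities depend on the standard library version.

```rust
use std::collections::HashMap;

/// Returns the map's capacity before and after inserting `extra` more entries.
fn capacity_growth(extra: i32) -> (usize, usize) {
    let mut map = HashMap::new();
    map.insert(2, 5);
    let before = map.capacity();

    // A shared reference obtained here would have to stay valid for as long
    // as it is used. The borrow checker forbids the inserts below while such
    // a reference is alive, which is the same rule the question ran into.
    // let d = map.get(&2).unwrap();

    for k in 100..100 + extra {
        map.insert(k, 0); // may reallocate the table when it is full
    }
    // println!("{d}"); // uncommenting both `d` lines fails to compile

    (before, map.capacity())
}

fn main() {
    let (before, after) = capacity_growth(1_000);
    println!("capacity before: {before}, after: {after}");
}
```

    The capacity grows because the map allocated new, larger storage along the way, so an address taken before the inserts could point into memory that has since been freed.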

    For the standard HashMap, the only way to do this is to lock the whole map, since a reallocation invalidates references to every element. If you use something like DashMap, which synchronizes access internally and therefore allows inserting through a shared reference, no locking is required on your side. However, other parts of its API can be more cumbersome (e.g. you can't return a plain reference to a value inside the map, because the get method returns an RAII wrapper that is used for synchronization), and you can run into unexpected deadlocks, for example by modifying the map while still holding such a wrapper.
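
    To give a rough idea of what "synchronizes access internally" can mean, here is a simplified, std-only sketch of a map split into independently locked shards. It is not DashMap's actual implementation or API: the ShardedMap type, its methods, and the shard count are hypothetical, and it uses std::thread::scope instead of the scoped-thread API from the question.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::sync::RwLock;
use std::thread;

/// Hypothetical, simplified sharded map; not DashMap's actual implementation.
struct ShardedMap {
    shards: Vec<RwLock<HashMap<i32, i32>>>,
}

impl ShardedMap {
    /// `shard_count` is assumed to be greater than zero.
    fn new(shard_count: usize) -> Self {
        ShardedMap {
            shards: (0..shard_count)
                .map(|_| RwLock::new(HashMap::new()))
                .collect(),
        }
    }

    /// Picks the shard responsible for `key` by hashing it.
    fn shard(&self, key: &i32) -> &RwLock<HashMap<i32, i32>> {
        let mut hasher = DefaultHasher::new();
        key.hash(&mut hasher);
        &self.shards[(hasher.finish() as usize) % self.shards.len()]
    }

    /// Inserts through `&self`: only the key's shard is write-locked.
    fn insert(&self, key: i32, value: i32) -> Option<i32> {
        self.shard(&key).write().unwrap().insert(key, value)
    }

    /// Returns a copy of the value, since a plain reference
    /// cannot outlive the lock guard.
    fn get(&self, key: &i32) -> Option<i32> {
        self.shard(key).read().unwrap().get(key).copied()
    }
}

/// Runs the scenario from the question and returns the values for keys 1 and 2.
fn run_sharded() -> (Option<i32>, Option<i32>) {
    let map = ShardedMap::new(4);
    map.insert(2, 5);
    thread::scope(|s| {
        s.spawn(|| {
            map.insert(1, 5);
        });
        s.spawn(|| {
            let d = map.get(&2);
            assert_eq!(d, Some(5));
        });
    });
    (map.get(&1), map.get(&2))
}

fn main() {
    println!("{:?}", run_sharded());
}
```

    Locks are still taken, but per shard and inside the map, so threads touching keys in different shards do not block each other. The price is the one described above: get cannot hand out a plain reference, so this sketch returns a copy instead.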