Search code examples
c++containersthread-safetystd

Is it thread-safe if I modify an element and insert an element on same container(e.g. std::unordered_map) from different threads?


I have a container std::unordered_map<std::string, std::vector<int>> myMap;, and a mutex std::mutex myMutex. When I try to access to myMap, I use myMutex like the codes below:

void insert(const std::string& Key)
{
  std::lock_guard<std::mutex> lock(myMutex);
  if(!myMap.contains(Key))
  {
    auto& v = myMap[Key];
    // do something to v to make it not empty
  }
}
std::vector<int>& get(const std::string& Key)
{
  std::lock_guard<std::mutex> lock(myMutex);
  static std::vector<int> empty; // this container will be read only
  if(myMap.contains(Key))
  {
    auto& v = myMap[Key];
    return v;
  }
  return empty;
}

Is it thread-safe if I try to modify the container

std::vector<int>& apples = get("Apple");
if(!apples.empty()) // check for existance
{
  //do something to apples
}

from one thread(I don't use myMutex when I try to change apples), and call the funtion

insert("AnotherApple");

from another thread?

In my case, I synchronize with insert and get, but I don't synchronize with them(especially insert, which will modify the unordered_map) when I do some change to the reference value (here is the container apples) of a specified key get from get. But I will use some other mutex(other than myMutex) to synchronize with these changes to prevent writes on the content of same key from different threads.

It seems that in https://stackoverflow.com/questions/25994312/can-i-access-a-c11-stdmap-entry-while-inserting-erasing-from-another-thread, the first answer by Yakk - Adam Nevraumont says

You can also modify the non-key component of an element of a map or set while other operations that do not read/write/destroy said element's non-key component run (which is most of them, except stuff like erase or =).

and that the statement in https://en.cppreference.com/w/cpp/container

  1. Elements of the same container can be modified concurrently with those member functions that are not specified to access these elements.

are both describing my case. But I am not really sure, can someone give me some clarification?

Edit : I am sorry about lacking of detail, here is a more detailed code. I try to do with exceptions as proposed by @Yksisarvinen though I am not familiar with it, so there may be some mistakes.

class myManager
{
public:
    myManager() = default;
    void insert(const std::string& id)
    {
        std::lock_guard<std::mutex> lock(myMutex);
        if (!myMap.contains(id))
        {
            myMap[id];// if throw exception here, then nothing happen?

            try {
                mySeparateMutex[id];
            }
            catch (...)
            {
                myMap.erase(myMap.find(id));// need to erase element with Key==id from myMap to consistent with mySeparateMutex
                throw;
            }
        }
    }
    std::tuple<std::vector<int>&, std::mutex&> get(const std::string& id, bool& ok)
    {
        std::lock_guard<std::mutex> lock(myMutex);
        static std::vector<int> Empty;
        static std::mutex noMutex;
        ok = false;
        if (myMap.contains(id))
        {
            ok = true;
            return std::tie(myMap[id], mySeparateMutex[id]);// I think operator[] will throw nothing
        }
        return std::tie(Empty, noMutex);
    }
private:
    // assume we can only access directly to these private member by public member functions above 
    std::mutex myMutex;
    std::unordered_map<std::string, std::vector<int>> myMap;
    std::unordered_map<std::string, std::mutex> mySeparateMutex;
};

myManager instance;

void fun0(const std::string& id)
{
    bool isContain = false;
    auto [a, b] = instance.get(id, isContain);
    if (isContain)
    {
        std::lock_guard<std::mutex> lock(b);// this mutex for preventing writing/reading container of same key
        // do some change
        a.push_back(1);
        a.push_back(2);
    }
}
void fun1(const std::string& id)
{
    instance.insert(id);
}
int fun2(const std::string& id, int index)
{
    bool isContain = false;
    auto [a, b] = instance.get(id, isContain);
    if (isContain)
    {
        std::lock_guard<std::mutex> lock(b);
        if (index >= 0 && index < a.size())
        {
            // do some read
            return a[index];
        }
    }
    return -1;
}

int main()
{
    fun1("apple");
    std::thread a([]() {
        fun0("apple");
        });
    std::thread b([]() {
        fun1("apple");
        });
    std::thread c([]() {
        fun2("apple", 0);
        });
    std::thread d([]() {
        fun0("apple");
        });
    // there are some other threads call fun* with id=="banana",...

    a.join();
    b.join();
    c.join();
    d.join();
    // join other threads
    return 0;

}

Solution

  • For completeness assuming that you have some

    std::mutex myMutex;
    std::unordered_map<std::string, std::vector<int>> myMap;
    

    and then

    int main() {
        insert("Apple");
        std::jthread t1([]() {
            std::vector<int>& apples = get("Apple");
            if(!apples.empty()) // check for existance
            {
                apples.push_back(70);
            }
        });
        std::jthread t2([]() {
            insert("Banana");
        });
    }
    

    There is no data race (neither logical so-called race-condition), because thanks to Iterator invalidation rules of std::unordered_map, references remain valid after the insertion, even if rehash happened (note that the same would not be true for iterators). It means that thread t2 is not touching the data used in thread t1, unless you would try to insert "Apple" again.

    As a side note, I would rather synchronize the access to myMap on the application level, not inside insert, get functions. That way you can always decide which sync granularity level is required and avoid race-conditions.

    EDIT after question edits

    Your new code with class myManager doesn't have data races (apart from visual inspection I also checked it with Helgrind and DRD), however it would be very easy to introduce such a race, should the user of your code forget e.g. about locking the borrowed mutex.

    If you later modify your code and e.g. add some myMap[id].push_back(3); in your insert method (AFTER the guarding if block), you will have a data race.

    Returning std::tie(Empty, noMutex); is ultra-hacky and very easy to misuse, user can easily forget about boolean arg check and modify Empty vector, lock noMutex, which doesn't make sense semantically.

    Your try-catch blocks are sceptical, you are focusing on the wrong problem. operator[] of unordered_map will throw only if constructors of Key or Value throw. Default std::vector and std::mutex constructors are both nothrow, only copy constructor of std::string can throw if allocation fails, which is imho highly unlikely.

    So again, this particular example is "correct", but I recommend removing synchronisation from myManager and doing it on the user side. E.g. lock the mutex (not owned by myManager) inside your fun0, fun1, fun2.