Search code examples
c++stdstdvector

Filling std::vector by different threads


I need fill one std::vector on differents threads.

Is it correct code? Or I should to add mutex for my code?

void func(int i, std::vector<float>& vec)
{
    vec[i] = i;
}

int main()
{
    std::vector<float> vec(6);
    std::list<std::thread> threads;
    for (int i = 0; i < 6; i++)
    {
        threads.push_back(std::thread(func, i, std::ref(vec)));
    }
    for (auto iter = threads.begin(); iter != threads.end(); iter++)
    {
      (*iter).join();
    }
}

I tested my code it works fine. Are there any pitfalls? Is it thread safe code?

What about get std::vector data by different threads?


Solution

  • Related question:
    Is std::vector thread-safe and concurrent by default? Why or why not?.

    It's thread-safe because you're not modifying the vector size, and not attempting to write to the same memory location in different threads.

    To future proof this answer for anyone who doesn't drill down into the link:

    1. It's not thread safe only because they're using the [] operator. It's thread safe because each thread is explicitly modifying a different location in memory.

    2. If all threads were only reading the same location using [], that would be thread safe.

    3. If all threads were writing to the same same location, using [] won't stop them from messing with each other.

    4. I think if this were production code, at LEAST a comment describing why this is thread safe would be called for. Not sure of any compile time way to prevent someone from shooting themselves in the foot if they modify this function.


    On point #4, we want to communicate to future users of this code that:

    1. No we're not guarding this standard library container even though that should be your gut reaction, and
    2. Yes we've analyzed it and it's safe.

    The easy way is to stick a comment in there, but there's a saying:

    The compiler doesn't read comments and neither do I.
    -Bjarne Stroustrup

    I think some kind of [[attributes]] should be the way to do this? Although the built-ins don't seem to support any kind of thread safety checking.


    Clang appears to provide Thread Safety Analysis:

    The analysis is still under active development, but it is mature enough to be deployed in an industrial setting.

    Assuming you implement other functions that would require a std::mutex to be responsible for your std::vector:

    std::mutex _mu;
    std::vector<int> _vec GUARDED_BY(_mu);
    

    then you can explicitly add the NO_THREAD_SAFETY_ANALYSIS attribute to turn off the safety checking for this one specific function. I think it's best to combine this with a comment:

    // I know this doesn't look safe but it is as long as
    // the caller always launches it with different values of `i`
    void foo(int i, std::vector<int>& vec) NO_THREAD_SAFETY_ANALYSIS;
    

    The use of GUARDED_BY tells me, in the future, that you are thinking about thread safety. The use of NO_THREAD_SAFETY_ANALYSIS shows me that you have determined this function is okay to use - especially when other functions that modify your vector are not marked NO_THREAD_SAFETY_ANALYSIS.

    2024 Edit

    This question doesn't ask about accessing the data after calling std::thread::join on the threads. Which you would obviously do in a normal program. It is not a data race to read the data after joining the threads. See Is it a data race to access shared data after joining a thread?.

    The calls to join() guarantees that all accesses in the threads happens before any accesses after the join() calls.