Search code examples
c++multithreadingthread-safetyatomic

cpp: how to make access of vector in class thread-safe?


I posted something in a similar direction yesterday, but that question was specifically about mutexes and I did not find much answer in the referenced "duplicate" thread. I want to try to ask more generally now, I hope thats okay.

Look at this code:

#include <iostream>
#include <mutex>
#include <vector>
#include <initializer_list>
using namespace std;

class Data {
public:

    void write_data(vector<float>& data) {
        datav = move(data);
    }

    vector<float>* read_data() {
        return(&datav);
    }

    Data(vector<float> in) : datav{ in } {};

private:
    vector<float> datav{};
};

void f1(vector<Data>& in) {
    for (Data& tupel : in) {
        vector<float>& in{ *(tupel.read_data()) };
        for (float& f : in) {
            f += (float)1.0;
        };
    };
}

void f2(vector<Data>& in) {
    for (Data& tupel : in) {
        vector<float>& in{ *(tupel.read_data()) };
        for (float& f : in) {
            cout << f << ",";
        };
    };
}
int main() {
    vector<Data> datastore{};
    datastore.emplace_back(initializer_list<float>{ 0.2, 0.4 });
    datastore.emplace_back(initializer_list<float>{ 0.6, 0.8 });
    vector<float> bigfv(50, 0.3);
    Data demo{ bigfv };
    datastore.push_back(demo);
    thread t1(f1, ref(datastore));
    thread t2(f2, ref(datastore));
    t1.join();
    t2.join();
};

In my expectancy, I would have guessed that I will get a wild mixture of output values depending on which thread first got to the vector value, so in the third "demo" vector with 50x0.3f, I would have expected a mixture of 0.3 (t2 got there first) and 1.3 (t1 got it first) as output. Even as I tried to use as much pass-by-reference, direct pointers, etc as possible to avoid copying (the original project uses quite large data amounts), the code behaves defined (always t2, then t1 access). Why? Don't I access the floats directly by reference in both thread functions?

How would you make this vector access well-defined? The only possible solutions I found in the other thread were:

-define a similar sized array of unique_ptr to mutexes (feels bad because I need to be able to add data containers to the datastore, so that would mean clearing the array and rebuilding it every time I change size of the datastore?), or

-make the access to the vector atomic (which as a thought makes my operation as I want it threadsafe, but there is no atomic invariant for a vector, or is there in some non-STL-lib?), or

-write a wrapper for a mutex in the data class?

It's for my project not important which thread accesses first, it's only important that I can definitifly read/write the whole vector into the data tupel by a thread without another thread manipulating the dataset at the same time.


Solution

  • I believe I did this now with reference to Sam's comments, and it seems to work, is this correct?

    #include <iostream>
    #include <mutex>
    #include <vector>
    #include <initializer_list>
    using namespace std;
    
    class Data {
    public:
        unique_ptr<mutex> lockptr{ new mutex };
        void write_data(vector<float>& data) {
            datav = move(data);
        }
    
        vector<float>* read_data() {
            return(&datav);
        }
    
        Data(vector<float> in) : datav{ in } {
        };
        Data(const Data&) = delete;
        Data& operator=(const Data&) = delete;
        Data(Data&& old) {
            datav = move(old.datav);
            unique_ptr<mutex> lockptr{ new mutex };
        }
        Data& operator=(Data&& old) {
            datav = move(old.datav);
            unique_ptr<mutex> lockptr{ new mutex };
        }
    private:
        vector<float> datav{};
        //mutex lock{};
    
    };
    
    void f1(vector<Data>& in) {
        for (Data& tupel : in) {
            unique_lock<mutex> lock(*(tupel.lockptr));
            vector<float>& in{ *(tupel.read_data()) };
            for (float& f : in) {
                f += (float)1.0;
            };
        };
    }
    
    void f2(vector<Data>& in) {
        for (Data& tupel : in) {
            (*(tupel.lockptr)).try_lock();
            vector<float>& in{ *(tupel.read_data()) };
            for (float& f : in) {
                cout << f << ",";
            };
            (*(tupel.lockptr)).unlock();
        };
    }
    int main() {
        vector<Data> datastore{};
        datastore.emplace_back(initializer_list<float>{ 0.2, 0.4 });
        datastore.emplace_back(initializer_list<float>{ 0.6, 0.8 });
        vector<float> bigfv(50, 0.3);
        Data demo{ bigfv };
        datastore.push_back(move(demo));
        thread t1(f1, ref(datastore));
        thread t2(f2, ref(datastore));
        t1.join();
        t2.join();
    };
    

    By usage of the unique_ptr, I should leave no memory leaks when I move the instance, right?