I have two global external variables. I want each thread to have its own private copy of one, and a shared copy of the other. The objective is to have each thread work on the private version, and then combine information into the second.
This may seem like overkill but it's the MWE of what I'm really doing:
The external variables are defined in Globals.h file:
extern vector<int> TestVector;
extern vector<vector<int>> CombinedTestVector;
And in the Globals.cpp file:
vector<int> TestVector;
vector<vector<int>> CombinedTestVector;
Executed MyFunc.cpp (with many includes, including Globals.h)
void MyFunc(int NumberOfIterations)
{
int iter;
SetupTestVector();
cout << "Test vector initially looks like:\n";
PrintTestVector();
# pragma omp parallel for default(shared) private(iter) firstprivate(TestVector)
for (iter = 0; iter < NumberOfIterations; iter++)
{
int MyThreadNum = omp_get_thread_num();
RemoveOneFromYourThreadIndex(MyThreadNum);
# pragma omp critical
{
cout << " Thread " << MyThreadNum << ", TestVector now looks like:\n";
cout << " ";
PrintTestVector();
}
CombinedTestVector.push_back(TestVector);
}
# pragma omp barrier
cout << "Combined vector looks like:\n";
PrintCombinedVector();
}
///////////////////////////////////////////////////////////////////////
void SetupTestVector()
{
TestVector.push_back(10);
TestVector.push_back(11);
TestVector.push_back(12);
TestVector.push_back(13);
TestVector.push_back(14);
}
///////////////////////////////////////////////////////////////////////
void PrintTestVector()
{
for (size_t i = 0; i < TestVector.size(); i++)
{
cout << TestVector.at(i) << " ";
}
cout << "\n";
}
///////////////////////////////////////////////////////////////////////
void RemoveOneFromYourThreadIndex(int ThreadIndex)
{
TestVector.at(ThreadIndex) -= 1;
TestVector.push_back(ThreadIndex);
}
///////////////////////////////////////////////////////////////////////
void PrintCombinedVector()
{
for (size_t i = 0; i < CombinedTestVector.size(); i++)
{
for (size_t j = 0; j < CombinedTestVector.at(i).size(); j++)
{
cout << CombinedTestVector.at(i).at(j) << " ";
}
cout << "\n";
}
cout << "\n";
}
If I execute this with NumberOfIterations = 1, I get:
Test vector initially looks like:
10 11 12 13 14
Thread 0, TestVector now looks like:
9 11 12 13 14 0
Combined vector looks like:
10 11 12 13 14
So you can see that the combined vector isn't getting back what I asked for... If execute with 3 threads, it's even worse:
Test vector initially looks like:
10 11 12 13 14 Thread 2, TestVector now looks like:
9 10 11 13 14 2 0 1
Thread 0, TestVector now looks like:
9 10 11 13 14 2 0 1
Thread 1, TestVector now looks like:
9 10 11 13 14 2 0 1
Combined vector looks like:
10 11 12 13 14
10 10 11 13 14 2
10 11 12 13 14
How can I get the intended behavior, which is to get out:
Test vector initially looks like:
10 11 12 13 14 Thread 0, TestVector now looks like:
9 11 12 13 14 0
Thread 1, TestVector now looks like:
10 10 12 13 14 1
Thread 1, TestVector now looks like:
10 11 11 13 14 2
Combined vector looks like:
9 11 12 13 14 0
10 10 12 13 14 1
10 11 11 13 14 2
There seems to be a misunderstanding here of what firstprivate
and related actually do. What they don't do is magically make the global variable refer to a different copy in each thread. Instead, they give you a copy in the local scope (that is copied from the global one if you use firstprivate
).
Think about it like this:
# pragma omp parallel for default(shared) private(iter) firstprivate(TestVector)
for (iter = 0; iter < NumberOfIterations; iter++)
{
thread_local vector<int> TestVector = ::TestVector; /* OMP MAGIC */
int MyThreadNum = omp_get_thread_num();
// etc. ...lots of stuff that completely ignores your local copy
// (The function calls still operate on the global TestVector!)
CombinedTestVector.push_back(TestVector);
}
The problem is clear then: In RemoveOneFromYourThreadIndex
you are referencing the global TestVector
, not the one that lives in the scope of your section with firstprivate
. You need to pass that local instance into RemoveOneFromYourThreadIndex
(and all the other functions that are supposed to operate on a TestVector
).
Also, CombinedTestVector.push_back(TestVector);
is not synchronized and thus a race condition, you should fix that too.
Misc comments regarding code quality:
using namespace std;
.Globals.h
is extremely concerning.