I am writing a C++ program with MPI, and when I pass a large file as stdin, the threads do not all see the same stdin contents.
In more detail, I pass a list of input files as standard input, which is then stored in a vector<string>:
MPI_Init(NULL,NULL);
int CORES, thread;
MPI_Comm_size(MPI_COMM_WORLD,&CORES);
MPI_Comm_rank(MPI_COMM_WORLD,&thread);
stringstream tline;
int count = 0;
for (std::string line; std::getline(std::cin, line);) {
    tline << line << " ";
    count++;
}
vector<string> args(count, "");
for (int i = 0; i < count; i++)
    tline >> args[i];
cout << thread << " " << count << endl; //each thread outputs the number of input files it received
My problem is that this gives different numbers for different threads. For instance, after passing a file of 10 000 lines, I get:
5 9464
6 9464
3 9464
4 9464
1 9554
2 9554
0 10000
7 9464
Is this caused by some kind of overhead? How can I avoid it?
OK, so basically your problem is that all of your threads are consuming lines from cin, and they race with each other. Even though cin gives some thread-safety guarantees in general, you cannot be certain what each reader will get. Check this thread: How do scanf(), std::cin behave on multithreaded environment?
Solution: don't use cin. Put the list in a file and have each thread open that file on its own, with its own file handle. If you really want to use cin, have one MPI thread read cin and broadcast the contents to the others; they can then do whatever they want with it.