I am trying to raise a matrix to a power with multiple threads, but I am not very good with threads. Also I enter the number of threads from keyboard and that number is in range [1, matrix height], then I do the following:
unsigned period = ceil((double)A.getHeight() / threadNum);
unsigned prev = 0, next = period;
for (unsigned i(0); i < threadNum; ++i) {
threads.emplace_back(&power<long long>, std::ref(result), std::ref(A), std::ref(B), prev, next, p);
if (next + period > A.getHeight()) {
prev = next;
next = A.getHeight();
}
else {
prev = next;
next += period;
}
}
It was easy for me to multiply one matrix by another with multiple threads, but here the problem is that once 1 step is done, for example I need to raise A to the power of 3, A^2 would be that one step, after that step I have to wait for all the threads to finish up, before moving on to doing A^2*A. How can I make my threads wait for that? I'm using std::thread's.
After the first reply was posted I realized that I forgot to mention that I want to create those threads only once, and not recreate them for each multiplication step.
I would suggest using condition_variable.
Algorithm would be something like this:
Split the matrix in N parts for N threads.
Each thread calculates the necessary resulting sub matrix for a single multiplication.
Then it increments an atomic threads_finished
counter using fetch_add
and waits on a shared condition variable.
Last thread that finishes (fetch_add()+1 == thread count), notifies all threads, that they can now continue processing.
Edit: Here is and example how to stop threads:
#include <iostream>
#include <thread>
#include <mutex>
#include <condition_variable>
#include <vector>
#include <algorithm>
#include <atomic>
void sync_threads(std::condition_variable & cv, std::mutex & mut, std::vector<int> & threads, const int idx) {
std::unique_lock<std::mutex> lock(mut);
threads[idx] = 1;
if(std::find(threads.begin(),threads.end(),0) == threads.end()) {
for(auto & i: threads)
i = 0;
cv.notify_all();
} else {
while(threads[idx])
cv.wait(lock);
}
}
int main(){
std::vector<std::thread> threads;
std::mutex mut;
std::condition_variable cv;
int max_threads = 10;
std::vector<int> thread_wait(max_threads,0);
for(int i = 0; i < max_threads; i++) {
threads.emplace_back([&,i](){
std::cout << "Thread "+ std::to_string(i)+" started\n";
sync_threads(cv,mut,thread_wait,i);
std::cout << "Continuing thread " + std::to_string(i) + "\n";
sync_threads(cv,mut,thread_wait,i);
std::cout << "Continuing thread for second time " + std::to_string(i) + "\n";
});
}
for(auto & i: threads)
i.join();
}
The interesting part is here:
void sync_threads(std::condition_variable & cv, std::mutex & mut, std::vector<int> & threads, const int idx) {
std::unique_lock<std::mutex> lock(mut); // Lock because we want to modify cv
threads[idx] = 1; // Set my idx to 1, so we know we are sleeping
if(std::find(threads.begin(),threads.end(),0) == threads.end()) {
// I'm the last thread, wake up everyone
for(auto & i: threads)
i = 0;
cv.notify_all();
} else { //I'm not the last thread - sleep until all are finished
while(threads[idx]) // In loop so, if we wake up unexpectedly, we go back to sleep. (Thanks for pointing that out Yakk)
cv.wait(lock);
}
}