Search code examples
c++multithreadingc++11matrixstdthread

How can I raise a matrix to a power with multiple threads?


I am trying to raise a matrix to a power with multiple threads, but I am not very good with threads. Also I enter the number of threads from keyboard and that number is in range [1, matrix height], then I do the following:

unsigned period = ceil((double)A.getHeight() / threadNum);
unsigned prev = 0, next = period;
for (unsigned i(0); i < threadNum; ++i) {
        threads.emplace_back(&power<long long>, std::ref(result), std::ref(A), std::ref(B), prev, next, p);

        if (next + period > A.getHeight()) {
            prev = next;
            next = A.getHeight();
        }
        else {
            prev = next;
            next += period;
        }
    }

It was easy for me to multiply one matrix by another with multiple threads, but here the problem is that once 1 step is done, for example I need to raise A to the power of 3, A^2 would be that one step, after that step I have to wait for all the threads to finish up, before moving on to doing A^2*A. How can I make my threads wait for that? I'm using std::thread's.

After the first reply was posted I realized that I forgot to mention that I want to create those threads only once, and not recreate them for each multiplication step.


Solution

  • I would suggest using condition_variable.

    Algorithm would be something like this:

    1. Split the matrix in N parts for N threads.

    2. Each thread calculates the necessary resulting sub matrix for a single multiplication.

    3. Then it increments an atomic threads_finished counter using fetch_add and waits on a shared condition variable.

    4. Last thread that finishes (fetch_add()+1 == thread count), notifies all threads, that they can now continue processing.

    5. Profit.

    Edit: Here is and example how to stop threads:

    #include <iostream>
    #include <thread>
    #include <mutex>
    #include <condition_variable>
    #include <vector>
    #include <algorithm>
    #include <atomic>
    
    void sync_threads(std::condition_variable & cv, std::mutex & mut, std::vector<int> & threads, const int idx) {
        std::unique_lock<std::mutex> lock(mut);
        threads[idx] = 1; 
        if(std::find(threads.begin(),threads.end(),0) == threads.end()) {
            for(auto & i: threads)
                i = 0;
            cv.notify_all();
        } else {
            while(threads[idx])
                cv.wait(lock);
        }
    }
    
    int main(){
    
        std::vector<std::thread> threads;
    
        std::mutex mut;
        std::condition_variable cv;
    
        int max_threads = 10;
        std::vector<int> thread_wait(max_threads,0);
    
        for(int i = 0; i < max_threads; i++) {
            threads.emplace_back([&,i](){
                    std::cout << "Thread "+ std::to_string(i)+" started\n";
                    sync_threads(cv,mut,thread_wait,i);
                    std::cout << "Continuing thread " + std::to_string(i) + "\n";
                    sync_threads(cv,mut,thread_wait,i);
                    std::cout << "Continuing thread for second time " + std::to_string(i) + "\n";
    
                });
        }
    
        for(auto & i: threads)
            i.join();
    }
    

    The interesting part is here:

    void sync_threads(std::condition_variable & cv, std::mutex & mut, std::vector<int> & threads, const int idx) {
        std::unique_lock<std::mutex> lock(mut); // Lock because we want to modify cv
        threads[idx] = 1; // Set my idx to 1, so we know we are sleeping
        if(std::find(threads.begin(),threads.end(),0) == threads.end()) {
            // I'm the last thread, wake up everyone
            for(auto & i: threads)
                i = 0;
            cv.notify_all();
        } else { //I'm not the last thread - sleep until all are finished
            while(threads[idx]) // In loop so, if we wake up unexpectedly, we go back to sleep. (Thanks for pointing that out Yakk)
                cv.wait(lock);
        }
    }