c++ multithreading templates parallel-processing header-files

How do I pass a template function to a thread within the same .cpp file?


I have an assignment to implement a parallel version of the longest common subsequence algorithm (just calculating the LCS length). The program must use threads to complete the task as quickly as possible (at least faster than a sequential implementation). Ideally, it should also use thread-local storage (TLS) in the threads. We had a similar assignment that implemented a template within a .hpp file, and I want to use the same template, but it seems I cannot use a .hpp file in this assignment. My problem lies in passing the template function to my threads. Below is the code:

#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <iostream>
#include <unistd.h>
#include <chrono>
#include <thread>
#include <functional>


#ifdef __cplusplus
extern "C" {
#endif

  void generateLCS(char* X, int m, char* Y, int n);
  void checkLCS(char* X, int m, char* Y, int n, int result);

#ifdef __cplusplus
}
#endif


class ParFor {
public:
template<typename TLS>
  void parfor(size_t beg, size_t endm, size_t endn, size_t increment,
           std::function<void(TLS&)> before,
           std::function<void(int, int, TLS&)> f,
           std::function<void(TLS&)> after
           ) {
    TLS tls;
    before(tls);
    for (size_t a=beg; a<endm; a+= increment) {
      for (size_t b=beg; b<endn; b+= increment) {
        f(a, b, tls);
      }
    }
    after(tls);
  }
};


int main (int argc, char* argv[]) {

  if (argc < 4) { std::cerr<<"usage: "<<argv[0]<<" <m> <n> <nbthreads>"<<std::endl;
    return -1;
  }

  int m = atoi(argv[1]);
  int n = atoi(argv[2]);
  int nbthreads = atoi(argv[3]);

  // get string data
  char *X = new char[m];
  char *Y = new char[n];
  generateLCS(X, m, Y, n);

  int result = 0; // length of common subsequence
  std::vector<std::thread> threads (nbthreads);
  int mSubset = m / nbthreads;
  int nSubset = n / nbthreads;
  ParFor pf;

  std::chrono::time_point<std::chrono::system_clock> start = std::chrono::system_clock::now();
    for(int j = 0; j < nbthreads; j++) {
            threads.push_back(std::thread(&ParFor::parfor, &pf, 0, mSubset, nSubset, 1,
                 [&](int& tls) -> void{
                   tls = result;
                 },
                 [&](int a, int b, int& tls) -> void{
                          if (X[a] == Y[b])
                            tls++;
                  },
                 [&](int tls) -> void{
                   result += tls;
                 }));
        }

    for(auto& t: threads)
            t.join();

  std::chrono::time_point<std::chrono::system_clock> end = std::chrono::system_clock::now();
  std::chrono::duration<double> elapsed_seconds = end-start;

  if(m < 10 || n < 10)
    result = 0;

  checkLCS(X, m, Y, n, result);

  std::cerr<<elapsed_seconds.count()<<std::endl;

  delete[] X;
  delete[] Y;

  return 0;
}

The "class ParFor" and everything from "int result" down to the end is what I added, the rest is pre-written code from the TA. If more clarification is needed, please let me know. Thank you.


Solution

  • You are right, you cannot pass a function template as if it were a function. They are different things, just like a cookie cutter is not a cookie.

    You have two main problems in your code.

    First, since ParFor::parfor is a template, you can only take a member function pointer to it if you provide template arguments that match what your lambdas use for TLS, so change it to this (for example):

    &ParFor::parfor<int> 
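
    To illustrate the first point in isolation, here is a minimal sketch. Summer, sum_range and sum_on_thread are hypothetical names standing in for the question's ParFor class, not part of the original code. It only shows that naming one specialization explicitly gives std::thread a concrete member function to call.

```cpp
#include <cassert>
#include <thread>

// Hypothetical class, standing in for ParFor from the question.
struct Summer {
    // Member function template: adds lo, lo+1, ..., hi-1 and stores the sum in *out.
    template <typename T>
    void sum_range(T lo, T hi, T* out) {
        T total{};
        for (T i = lo; i < hi; ++i) total += i;
        *out = total;
    }
};

// Runs Summer::sum_range<long> on one worker thread and returns the sum of 0..n-1.
long sum_on_thread(long n) {
    assert(n >= 0);
    Summer s;
    long result = 0;
    // &Summer::sum_range alone names a template, not a function.
    // Adding <long> selects one specialization whose address can be taken.
    std::thread t(&Summer::sum_range<long>, &s, 0L, n, &result);
    t.join();  // join before reading result
    return result;
}
```

    Calling sum_on_thread(5) from main adds 0 through 4 on the worker thread. Without the <long>, the first argument to std::thread would not name a single function.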
    

    Your second problem is passing a lambda to a function template and expecting deduction to reason "this lambda is convertible to a std::function, so it's a match, deduce from that". Deduction does not consider implicit conversions: the argument's type has to match the parameter pattern exactly, and a lambda is not a std::function. If you pass in std::function objects, the template parameters can be deduced.
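
    The deduction rule can be seen in a small free-standing sketch. apply_once and deduction_demo are hypothetical names chosen for illustration. The commented-out call is the one that fails; the two calls after it show ways to give the compiler what it needs.

```cpp
#include <cassert>
#include <functional>

// Hypothetical function template with the same shape as a parfor parameter:
// T appears only inside a std::function type.
template <typename T>
T apply_once(std::function<void(T&)> f) {
    T value{};  // value-initialized, so 0 for int
    f(value);
    return value;
}

int deduction_demo() {
    auto set7 = [](int& x) { x = 7; };

    // apply_once(set7);
    // Does not compile: a closure type is not a std::function<void(T&)>,
    // and deduction will not look through the implicit conversion.

    // Passing a real std::function lets T be deduced as int.
    int a = apply_once(std::function<void(int&)>(set7));

    // Spelling out T means nothing is left to deduce, so the lambda
    // is converted to std::function<void(int&)> implicitly.
    int b = apply_once<int>(set7);

    assert(a == b);  // both calls run the same lambda
    return a + b;
}
```

    The last call suggests that once the template argument is written explicitly, the lambda-to-std::function conversion is allowed, which is the same mechanism the explicit <int> relies on.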

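    So change the loop that creates the threads to wrap each lambda in a std::function before passing it, and it will compile. (Writing std::function{...} without template arguments relies on class template argument deduction, so this needs C++17 or later.) There's too much going on in your code for me to want to dig into the details beyond this, so if there are other bugs, those are still yours. :)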

    for(int j = 0; j < nbthreads; j++) {
        threads.push_back(std::thread(
            &ParFor::parfor<int>, &pf, 0, mSubset, nSubset, 1,
            std::function{[&](int& tls) -> void{
                tls = result;
            }},
            std::function{[&](int a, int b, int& tls) -> void {
                if (X[a] == Y[b])
                    tls++;
            }},
            std::function{[&](int tls) -> void{
                result += tls;
            }}));
    }
    

    Also, the trailing return type declaring that your lambdas return void is redundant. You can simply remove the -> void without changing anything.