Search code examples
c++cilk

Porting vector (Intel Cilk Plus) or list (GCC Cilk Plus) reducer to OpenCilk 2.1


I am unable to find examples that explain how to port lines of code that use the reducer_vector class in Intel Cilk Plus or that use the reducer_list class in GCC Cilk Plus to OpenCilk 2.1 (provides the cilk_reducer keyword).

For instance, consider the following lines of code (assuming GCC Cilk Plus) shown as an example in David Bindel's slides (Beyond C/C++) https://www.cs.cornell.edu/~bindel/class/cs5220-f15/slides/2015-11-17-languages.pdf:

#include <cilk/cilk.h>
#include <cilk/reducer_list.h>

void reducer_list_test() {
  using namespace std;
  cilk::reducer< cilk::op_list_append<char> > letters_reducer;
  // Build the list in parallel
  cilk_for (char ch = ’a’; ch <= ’z’; ch++) {
    simulated_work();
    letters_reducer->push_back(ch);
  }
  // Reducer result as a standard STL list then output
  const list<char> &letters = letters_reducer.get_value();
  cout << "Letters from reducer_list:";
  for (auto i = letters.begin(); i != letters.end(); i++)
    cout << " " << *i;
  cout << endl;
}

Would someone provide an example to port the above lines of code to OpenCilk 2.1?


Solution

  • This works for me, I am using std::list::splice for appends, but you can also use std::vector::insert or std::deque::insert (probably faster since it would have less overhead, it all depends on the type of the objects you are appending).

    #include <cilk/cilk.h>
    
    #include <list>
    #include <iostream>
    
    using vec = std::list<char>;
    template<class C>
    void print_vl(const C& v)
    {
      for (auto i = v.begin(); i != v.end(); i++) std::cout << " " << *i;
    }
    
    void identity(void* view) { new (view) vec(); }
    void append(void* left, void* right)
    {
        auto& l = *static_cast<vec*>(left);
        auto& r = *static_cast<vec*>(right);
        std::cout << "Inserting ";
        print_vl(r);
        std::cout << ", into ";
        print_vl(l);
        l.splice(l.end(), r);
        std::cout << ". Result: ";
        print_vl(l);
        std::cout << "\n";
        r.~vec();
    }
    
    void reducer_list_test()
    {
        vec cilk_reducer(identity, append) letters_reducer{};
        // Build the list in parallel
        cilk_for(char ch = 'a'; ch <= 'z'; ch++)
        {
            // simulated_work();
            letters_reducer.push_back(ch);
        }
        // Reducer result as a standard STL list then output
        const std::list<char> letters(letters_reducer.begin(), letters_reducer.end());
        std::cout << "Letters from reducer_list:";
        print_vl(letters);
        std::cout << std::endl;
    }
    
    int main() { reducer_list_test(); }
    

    When compiling with cilk sanitizer:

    clang++ -fopencilk -fsanitize=cilk -O3 test.cpp

    I get the following output:

    ./a.out                                                                                            
    Running Cilksan race detector.
    Inserting  b, into  a. Result:  a b
    Inserting  c, into  a b. Result:  a b c
    Inserting  d, into  a b c. Result:  a b c d
    Inserting  e, into  a b c d. Result:  a b c d e
    Inserting  f, into  a b c d e. Result:  a b c d e f
    Inserting  g, into  a b c d e f. Result:  a b c d e f g
    Inserting  h, into  a b c d e f g. Result:  a b c d e f g h
    Inserting  i, into  a b c d e f g h. Result:  a b c d e f g h i
    Inserting  j, into  a b c d e f g h i. Result:  a b c d e f g h i j
    Inserting  k, into  a b c d e f g h i j. Result:  a b c d e f g h i j k
    Inserting  l, into  a b c d e f g h i j k. Result:  a b c d e f g h i j k l
    Inserting  m, into  a b c d e f g h i j k l. Result:  a b c d e f g h i j k l m
    Inserting  n, into  a b c d e f g h i j k l m. Result:  a b c d e f g h i j k l m n
    Inserting  o, into  a b c d e f g h i j k l m n. Result:  a b c d e f g h i j k l m n o
    Inserting  p, into  a b c d e f g h i j k l m n o. Result:  a b c d e f g h i j k l m n o p
    Inserting  q, into  a b c d e f g h i j k l m n o p. Result:  a b c d e f g h i j k l m n o p q
    Inserting  r, into  a b c d e f g h i j k l m n o p q. Result:  a b c d e f g h i j k l m n o p q r
    Inserting  s, into  a b c d e f g h i j k l m n o p q r. Result:  a b c d e f g h i j k l m n o p q r s
    Inserting  t, into  a b c d e f g h i j k l m n o p q r s. Result:  a b c d e f g h i j k l m n o p q r s t
    Inserting  u, into  a b c d e f g h i j k l m n o p q r s t. Result:  a b c d e f g h i j k l m n o p q r s t u
    Inserting  v, into  a b c d e f g h i j k l m n o p q r s t u. Result:  a b c d e f g h i j k l m n o p q r s t u v
    Inserting  w, into  a b c d e f g h i j k l m n o p q r s t u v. Result:  a b c d e f g h i j k l m n o p q r s t u v w
    Inserting  x, into  a b c d e f g h i j k l m n o p q r s t u v w. Result:  a b c d e f g h i j k l m n o p q r s t u v w x
    Inserting  y, into  a b c d e f g h i j k l m n o p q r s t u v w x. Result:  a b c d e f g h i j k l m n o p q r s t u v w x y
    Inserting  z, into  a b c d e f g h i j k l m n o p q r s t u v w x y. Result:  a b c d e f g h i j k l m n o p q r s t u v w x y z
    Letters from reducer_list: a b c d e f g h i j k l m n o p q r s t u v w x y z
    
    Cilksan detected 0 distinct races.
    Cilksan suppressed 0 duplicate race reports.
    

    Without the sanitizer I get something more optimized, but correct nonetheless:

    ./a.out                                                                                              
    Inserting  q, into  n o p. Result:  n o p q
    Inserting  r, into  n o p q. Result:  n o p q r
    Inserting  s, into  n o p q r. Result:  n o p q r s
    Inserting  n o p q r Inserting  sw, into  a b  xc , into d e f g h  ti j k l  u v. Result:  mt. Result:   a b c d e u v fw x
    Inserting  y, into  t u v w x. Result:  t u v w x y
     g h i j k l m n o p q r s
    Inserting  t u v w x y, into  a b c d e f g h i j k l m n o p q r s. Result:  a b c d e f g h i j k l m n o p q r s t u v w x y
    Inserting  z, into  a b c d e f g h i j k l m n o p q r s t u v w x y. Result:  a b c d e f g h i j k l m n o p q r s t u v w x y z
    Letters from reducer_list: a b c d e f g h i j k l m n o p q r s t u v w x y z
    

    A faster (I haven't benchmarked) version using deque would be like follows:

    #include <cilk/cilk.h>
    
    #include <deque>
    #include <iostream>
    
    using vec = std::deque<char>;
    template<class C>
    void print_vl(const C& v)
    {
      for (auto i = v.begin(); i != v.end(); i++) std::cout << " " << *i;
    }
    
    void identity(void* view) { new (view) vec(); }
    void append(void* left, void* right)
    {
        auto& l = *static_cast<vec*>(left);
        auto& r = *static_cast<vec*>(right);
        l.insert(l.end(), r.begin(), r.end());
        r.~vec();
    }
    
    void reducer_list_test()
    {
        vec cilk_reducer(identity, append) letters_reducer{};
        // Build the list in parallel
        cilk_for(char ch = 'a'; ch <= 'z'; ch++)
        {
            // simulated_work();
            letters_reducer.push_back(ch);
        }
        std::cout << "Letters from reducer_list:";
        print_vl(letters_reducer);
        std::cout << std::endl;
    }
    
    int main() { reducer_list_test(); }
    

    Notice that deque implementation performance varies a lot between standard library implementations. Have a look here.