Threading Building Blocks segmentation fault


I am trying to debug C++ code that uses Intel Threading Building Blocks (TBB), following the procedure described in "Debugging in threading building Blocks".

I ran the code with a single thread and with TBB_USE_DEBUG set to 1 (see my previous question, "Using Intel TBB in debug mode"). However, I get a segmentation fault that I cannot explain. Here is the gdb backtrace:

#0  0x00007ffff793fbc1 in ?? () from /usr/lib/x86_64-linux-gnu/libtbb.so.2
#1  0x000000000040d2cb in tbb::task::spawn_and_wait_for_all (child=..., this=0x7ffff63b7a40)
    at /usr/include/tbb/task.h:728
#2  MpsTask1::execute (this=0x7ffff63b7a40)
    at /capps/mps_implementations.hpp:102
#3  0x00007ffff793ffdd in ?? () from /usr/lib/x86_64-linux-gnu/libtbb.so.2
#4  0x000000000040d2cb in tbb::task::spawn_and_wait_for_all (child=..., this=0x7ffff63b7d40)
    at /usr/include/tbb/task.h:728
#5  MpsTask1::execute (this=0x7ffff63b7d40)
    at /capps/mps_implementations.hpp:102

I am quite puzzled by this backtrace, because I cannot control what happens inside the library. Could a mistake in my own code cause spawn_root_and_wait to fail?

Here is my code (constructor and destructor omitted to keep it short). Its purpose is to compute the maximum prefix sum of an array via a reduction. It recursively divides the array until the chunks are small enough, and then joins the results. I know I could simply use TBB's parallel_reduce template for this problem, but my goal is to understand how TBB task-based programming works.

class MpsTask1: public task {
public:

    task* execute(){
        if(size <= Cutoff){

            for(int i = left; i != right; i++){
                *sum = *sum + array[i];

                if (*sum <= *mps){
                    *mps = *sum;
                    *position = i+1; 
                }
            }

            memo[depth][index] = cutoff;

        }else{
            // Parameters for subtasks
            int middle = (right+left)/2;
            int sizel = middle - left;
            int sizer = right - middle;
            int newDepth = depth + 1;
            int lIndex = 2*index;
            int rIndex = lIndex + 1;

            // Variables for results
            int lPos = left;
            int rPos = middle;
            double lsum, rsum, lmps, rmps;

            // Create subtasks
            set_ref_count(3);

            MpsTask1& lTask = *new(allocate_child()) MpsTask1(Cutoff,array,sizel,&lsum,&lmps,&lPos,memo,newDepth,lIndex,left,middle);
            spawn(lTask);

            MpsTask1 &rTask = *new(allocate_child()) MpsTask1(Cutoff,array,sizer,&rsum, &rmps,&rPos,memo,newDepth,rIndex,middle,right);

            spawn_and_wait_for_all(rTask);

            // Join results
            rmps = lsum+rmps;
            *sum = lsum+rsum;
            if(*mps <= rmps){
                *mps = rmps;
                *position = rPos;
                memo[depth][index] = rightChild;
            }
            else{
                *mps = lmps;
                *position = lPos;
                memo[depth][index] = leftChild;
            }
            return NULL;
        }
    }

private:
    /* Below this size, the mps and sum are computed sequentially */
    int Cutoff;
    /* Input array and its size */
    double* array;
    int size;    
    /* Identification of the task */
    int depth;
    int index;
    int left;
    int right;
    /* Intervals for sum and mps */
    double* sum;
    double* mps;
    /* Position of the mps */
    int* position;
    // Status : left child, right child, or cutoff
    Status** memo;
};

void parallel_mps(double* array, int size, int Cutoff){
    // Create variables for result
    double sum = 0., mps = 0.;
    int position = 0;

    // Initialization of memo
    ....
    // Finished initialization of memo


    MpsTask1& root = *new(task::allocate_root()) MpsTask1(Cutoff,array,size,&sum,&mps,&position,memo);

    task::spawn_root_and_wait(root);
}
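
For reference, the divide-and-join scheme described above can be sketched sequentially, without TBB. This is only an illustration under stated assumptions: the names Chunk, scan, join and max_prefix_sum are hypothetical and do not come from the code above, and the empty prefix is assumed to count as a prefix with sum 0 (consistent with mps being initialized to 0 in parallel_mps).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Result for one chunk [left, right) of the array (hypothetical helper type).
struct Chunk {
    double sum;       // sum of every element in the chunk
    double mps;       // maximum prefix sum; the empty prefix counts as 0
    std::size_t pos;  // absolute index one past the end of the best prefix
};

// Sequential scan, used once a chunk is at or below the cutoff.
Chunk scan(const std::vector<double>& a, std::size_t left, std::size_t right) {
    Chunk c{0.0, 0.0, left};
    for (std::size_t i = left; i != right; ++i) {
        c.sum += a[i];
        if (c.sum > c.mps) {  // strict comparison keeps the earliest best prefix
            c.mps = c.sum;
            c.pos = i + 1;
        }
    }
    return c;
}

// Join step: the best prefix either ends inside the left half, or it
// covers the whole left half and ends inside the right half.
Chunk join(const Chunk& l, const Chunk& r) {
    Chunk c;
    c.sum = l.sum + r.sum;
    if (l.sum + r.mps > l.mps) {
        c.mps = l.sum + r.mps;
        c.pos = r.pos;
    } else {
        c.mps = l.mps;
        c.pos = l.pos;
    }
    return c;
}

// Recursive divide-and-join; every path returns a value.
Chunk max_prefix_sum(const std::vector<double>& a, std::size_t left,
                     std::size_t right, std::size_t cutoff) {
    assert(left <= right && right <= a.size() && cutoff >= 1);
    if (right - left <= cutoff) {
        return scan(a, left, right);
    }
    std::size_t middle = left + (right - left) / 2;
    Chunk l = max_prefix_sum(a, left, middle, cutoff);
    Chunk r = max_prefix_sum(a, middle, right, cutoff);
    return join(l, r);
}
```

Calling max_prefix_sum(a, 0, a.size(), cutoff) should give the same result for any cutoff of at least 1, which makes it a convenient reference to compare a task-based version against.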

Solution

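  • I finally solved it, and the problem was indeed in my code. When spawn_and_wait_for_all() is called, the scheduler runs the execute() method of the previously spawned tasks, but gdb shows the library's own frames only as ?? in the backtrace, so the call chain is hard to follow. What worked for me was to set the number of threads to 1 and fall back on good old cout << endl << "Checkpoint" << endl debugging. In this case, I had forgotten the return statement in the first branch of the if in execute(). Since execute() is declared to return a task*, falling off the end of that branch is undefined behavior, and the scheduler may receive a garbage pointer as the next task to run. Even a small mistake like this seems to have dramatic consequences with TBB.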
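
To illustrate the fix without depending on TBB, here is a minimal sketch. It assumes a stand-in Task interface and a stand-in run() loop; both are hypothetical and are not the TBB classes. The loop treats the value returned by execute() as the next task to run, and both branches of execute() return explicitly. SumTask and sum_with_tasks are likewise made-up names for the example.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in task interface (NOT tbb::task): execute() returns the next
// task to run, or nullptr when there is nothing to chain.
struct Task {
    virtual ~Task() {}
    virtual Task* execute() = 0;
};

// Stand-in scheduler loop: keeps running whatever execute() hands back.
// A garbage return value would be dereferenced here.
int run(Task* t) {
    int executed = 0;
    while (t != nullptr) {
        t = t->execute();
        ++executed;
    }
    return executed;
}

// Sums an array by recursive splitting, mirroring the if/else shape
// of execute() in the question.
struct SumTask : Task {
    const double* data;
    std::size_t size;
    std::size_t cutoff;
    double* out;

    SumTask(const double* d, std::size_t n, std::size_t c, double* o)
        : data(d), size(n), cutoff(c), out(o) {}

    Task* execute() override {
        if (size <= cutoff) {
            double s = 0.0;
            for (std::size_t i = 0; i != size; ++i) s += data[i];
            *out = s;
            return nullptr;  // leaf branch: the return missing in the question
        } else {
            std::size_t half = size / 2;
            double l = 0.0, r = 0.0;
            SumTask left(data, half, cutoff, &l);
            SumTask right(data + half, size - half, cutoff, &r);
            run(&left);   // children run to completion before the join
            run(&right);
            *out = l + r;
            return nullptr;  // nothing to chain after the join
        }
    }
};

double sum_with_tasks(const double* data, std::size_t n, std::size_t cutoff) {
    assert(cutoff >= 1);  // guarantees that every split makes progress
    double total = 0.0;
    SumTask root(data, n, cutoff, &total);
    run(&root);
    return total;
}
```

With GCC, compiling with -Wall enables -Wreturn-type, which warns that control reaches the end of a non-void function; that warning would likely have flagged the missing return before any run-time debugging was needed.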