Intel TBB program does not terminate, possible misuse of reference counter


I have the following TBB code snippet. The code does not work as expected.

#include <cstdlib>   // EXIT_SUCCESS
#include <iostream>
#include <unistd.h>  // sleep
#include <tbb/mutex.h>
#include <tbb/tbb.h>

using namespace std;
using namespace tbb;

tbb::mutex printLock;

class Task1 : public task {
public:
  task* execute() {
    printLock.lock();
    cout << "Task 1 start" << std::endl;
    printLock.unlock();

    printLock.lock();
    cout << "Task 1 end" << std::endl;
    printLock.unlock();

    return NULL;
  }
};

class Task2 : public task {
public:
  task* execute() {
    printLock.lock();
    cout << "Task 2 start" << std::endl;
    printLock.unlock();

    printLock.lock();
    cout << "Task 2 end" << std::endl;
    printLock.unlock();

    return NULL;
  }
};

class Task5 : public task {
public:
  task* execute() {
    printLock.lock();
    std::cout << "Task 5 start" << std::endl;
    printLock.unlock();

    printLock.lock();
    std::cout << "T5:Sleep start" << std::endl;
    printLock.unlock();
    sleep(10);
    printLock.lock();
    std::cout << "T5:Sleep end" << std::endl;
    printLock.unlock();

    printLock.lock();
    std::cout << "Task 5 end" << std::endl;
    printLock.unlock();

    return NULL;
  }
};

class Task4 : public task {
public:
  task* execute() {
    printLock.lock();
    std::cout << "Task 4 start" << std::endl;
    printLock.unlock();

    set_ref_count(1); // Create a child but do not want to wait

    task& u = *new (task::allocate_child()) Task5();
    task::spawn(u);

    // task::wait_for_all(); // Task 5 is asynchronous, just to print

    printLock.lock();
    std::cout << "Task 4 end" << std::endl;
    printLock.unlock();

    return NULL;
  }
};

class Task3 : public task {
public:
  task* execute() {
    printLock.lock();
    std::cout << "Task 3 start" << std::endl;
    printLock.unlock();

    set_ref_count(2);

    task& u = *new (task::allocate_child()) Task4();
    task::spawn(u);

    task::wait_for_all();

    printLock.lock();
    std::cout << "Task 3 end" << std::endl;
    printLock.unlock();

    return NULL;
  }
};

class Root1 : public task {
public:
  task* execute() {
    printLock.lock();
    std::cout << "Root1 start" << std::endl;
    printLock.unlock();

    set_ref_count(4);

    task& a = *new (task::allocate_child()) Task1();
    task& v = *new (task::allocate_child()) Task3();
    task& b = *new (task::allocate_child()) Task2();

    task::spawn(a);
    task::spawn(v);
    task::spawn(b);

    task::wait_for_all();

    printLock.lock();
    std::cout << "Root1 end" << std::endl;
    printLock.unlock();

    return NULL;
  }
};

int main() {
  task& v = *new (task::allocate_root()) Root1();
  task::spawn_root_and_wait(v);
  return EXIT_SUCCESS;
}

I expect to see the following (or similar) output. Task 5 may or may not start or finish; that does not matter.

Root 1 start 
Task 2 start
Task 2 end
Task 3 start
Task 4 start
Task 4 end
Task 3 end
Task 5 start
T5:Sleep start
Task 1 start
Task 1 end
T5:Sleep end
Task 5 end
Root 1 end

However, the program keeps running, possibly forever, with Tasks 3, 4, and 5 repeating:

Root 1 start
Task 2 start
Task 2 end
Task 3 start
Task 4 start
Task 4 end
Task 3 end
Task 5 start
T5:Sleep start
Task 1 start
Task 1 end
T5:Sleep end
Task 5 end
Task 4 start
Task 4 end
Task 3 start
Task 4 start
Task 4 end
Task 5 start
T5:Sleep start
Task 5 start
T5:Sleep start
Task 5 start
T5:Sleep start
T5:Sleep end
Task 5 end
Task 4 start
Task 4 end
Task 5 start
T5:Sleep start
T5:Sleep end
T5:Sleep end
Task 5 end
Task 5 start
Task 5 end
Task 4 start
Task 4 end
Task 5 start
T5:Sleep start
Task 5 start
T5:Sleep start
T5:Sleep start
^C

I do not fully understand the behavior. The error is possibly related to how I use set_ref_count(). Any pointers that help me understand the issue would be appreciated. Thanks.


Solution

  • The issue is with the asynchronous task. When Task 5 finishes, it decrements the refcount of its parent, Task 4. That refcount reaches zero, so Task 4 is run again and spawns Task 5 again, which creates a loop. This in turn leads to a race on the refcount of Task 3, which causes an additional loop. For an asynchronous pattern, you need to use the continuation-passing approach. However, I would recommend using one or more tbb::task_group objects instead to simplify the synchronization.
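
To make the refcount loop easier to see, here is a minimal single-threaded toy model of the rule described above: when a task finishes, its parent's count is decremented, and a parent whose count reaches zero is scheduled. This is only a sketch under that assumption. ToyTask, run, looping_trace, and continuation_trace are hypothetical names, none of this is TBB's API or its real scheduler, and max_steps exists only so the looping case stops.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a task; not tbb::task.
struct ToyTask {
    std::string name;
    ToyTask* parent;
    int ref_count;
    // Body may re-arm the task's own count and push children onto the ready queue.
    std::function<void(ToyTask&, std::deque<ToyTask*>&)> body;

    explicit ToyTask(std::string n, ToyTask* p = nullptr)
        : name(std::move(n)), parent(p), ref_count(0) {}
};

// Runs ready tasks in FIFO order until the queue drains or max_steps
// tasks have run. Returns the task names in execution order.
std::vector<std::string> run(std::deque<ToyTask*> ready, std::size_t max_steps) {
    std::vector<std::string> log;
    while (!ready.empty() && log.size() < max_steps) {
        ToyTask* t = ready.front();
        ready.pop_front();
        assert(t != nullptr);
        log.push_back(t->name);
        if (t->body) t->body(*t, ready);
        // Assumed completion rule: decrement the parent's count and
        // schedule the parent when the count reaches zero.
        if (t->parent && --t->parent->ref_count == 0)
            ready.push_back(t->parent);
    }
    return log;
}

// Mirrors the question: Task 4 sets its count to 1, spawns Task 5, and
// returns without waiting. Task 5 finishing then reschedules Task 4.
std::vector<std::string> looping_trace(std::size_t max_steps) {
    ToyTask task4("Task 4");
    ToyTask task5("Task 5", &task4);
    task4.body = [&task5](ToyTask& self, std::deque<ToyTask*>& ready) {
        self.ref_count = 1;
        ready.push_back(&task5);
    };
    return run({&task4}, max_steps);
}

// Continuation-style variant: Task 5's parent is a separate, empty
// continuation task, so Task 4 is never scheduled a second time.
std::vector<std::string> continuation_trace(std::size_t max_steps) {
    ToyTask continuation("Continuation");
    continuation.ref_count = 1;  // exactly one child: Task 5
    ToyTask task5("Task 5", &continuation);
    ToyTask task4("Task 4");
    task4.body = [&task5](ToyTask&, std::deque<ToyTask*>& ready) {
        ready.push_back(&task5);
    };
    return run({&task4}, max_steps);
}
```

In this model, looping_trace(6) alternates between Task 4 and Task 5 until the step cap stops it, while continuation_trace(6) runs Task 4, Task 5, and the continuation once each and then drains. The real scheduler is multithreaded, and in the original program Task 4 has already finished by the time Task 5 completes, so the actual behavior is messier than this trace suggests.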