Search code examples
c++treetraversaltbb

TBB task_groups without using stack


I would like to perform a post-order tree traversal in C++. The tree can be so deep that I cannot use recursion (it exhausts the stack). Instead I create an std::stack<...> that puts everything on the heap, and traverse the tree in a while loop rather than with function calls. This works great.

Now I would like to parallelize the whole process using TBB. I was thinking of creating a task_group at every node, and running the same functor on each of its children. But it occurs to me that this would run into the same problem with the tree depth I had before: running the functor on each node of the deepest path will eat away something from the stack until the whole thing runs out.

Is there a way out of this problem? Or am I imagining the whole thing; is there some magic behind task_group::wait() that avoids this problem?


Solution

  • From TBB documentation (about implicit continuation):

    Because the parent blocks, its thread’s stack cannot be popped yet. The thread must be careful about what work it takes on, because continually stealing and blocking could cause the stack to grow without bound.

    This is not exactly about that, but it shows that TBB is not using any stack magic to empty stack for currently blocked tasks. This means that you would get stack-overflows a bit later (spreaded among multiple thread's stacks) when using implicit continuation.

    Using explicit continuation might solve the problem, but that depends heavily on internal TBB implementation of thread scheduler (which is not documented). There is a fair chance that it would work correctly - the only way to know is to either look through TBB sources amd see how task are handled, or write simple test program with small stack and look if something simple will be able to exhaust it.