Tags: multithreading, concurrency, c++17, threadpool, atomic

Efficient way to check if we should exit a working thread


This is in a thread-pool context. I have a global atomic variable that indicates whether the worker threads should exit:

std::atomic<bool> exit;

Each worker thread runs a loop like this:

while (true) {
  std::unique_lock lock{mut};
  // Sleep until there is work or an exit has been requested.
  cv.wait(lock, [] {
    return !queue.empty() || exit.load();
  });

  if (exit.load()) {
    return;
  }

  // take task from queue
}

Is there a more efficient way (better performance, lower latency) to check whether a worker thread should exit?


Solution

  • To minimize the throughput cost of checking an exit_now flag, use .load(std::memory_order_relaxed).

    That's extremely cheap if nothing in the same cache line is written frequently by any thread. That is, with no false sharing, every thread can have the cache line in MESI Shared state and get L1d cache hits from loading it. (Or at least L2 or L3 hits, if you do a bunch of work between checks.)

    relaxed loads don't need any extra ordering or barriers, so they're as cheap as accessing a non-atomic global variable (in cases where it couldn't be optimized into a register for the loop duration.)

    Related: Why set the stop flag using `memory_order_seq_cst`, if you check it with `memory_order_relaxed`?
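As a minimal sketch of the relaxed-load check described above: the names ExitFlag, worker_loop and run_and_stop are hypothetical, the 64-byte cache-line size is an assumption about the target CPU, and the counter increment stands in for real work.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <thread>
#include <vector>

// Assumption: 64-byte cache lines. Padding the flag to a full line keeps
// frequently written data from sharing its line (avoids false sharing).
struct alignas(64) ExitFlag {
    std::atomic<bool> value{false};
};
ExitFlag exit_now;  // written once at shutdown, read by every worker

// Hypothetical worker: polls the flag with a relaxed load between work units.
std::uint64_t worker_loop() {
    std::uint64_t units_done = 0;
    while (!exit_now.value.load(std::memory_order_relaxed)) {
        ++units_done;  // stand-in for one unit of real work
    }
    return units_done;
}

// Starts n workers, lets them run briefly, requests exit, and joins them.
// Returns how many workers saw the flag and returned.
int run_and_stop(int n) {
    assert(n > 0);
    exit_now.value.store(false);
    std::atomic<int> exited{0};
    std::vector<std::thread> threads;
    for (int i = 0; i < n; ++i) {
        threads.emplace_back([&exited] {
            worker_loop();
            exited.fetch_add(1, std::memory_order_relaxed);
        });
    }
    std::this_thread::sleep_for(std::chrono::milliseconds(10));
    exit_now.value.store(true);  // default seq_cst store; it runs only once
    for (auto& t : threads) t.join();
    return exited.load();
}
```

Calling run_and_stop(4) should start four pollers and return once all of them have seen the flag. Only the read side is relaxed here; the single store is left at the default ordering because its cost is paid once.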


    But that just minimizes overhead of checking the exit flag.

    You could use a lock-free queue that blocks when empty, and check the exit flag after every item. I guess that makes exit latency a tradeoff against throughput: if you can't enqueue something after setting the exit flag to make sure readers wake up, they have to wake up periodically just to check the flag, even when there's no work in the queue.
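The "enqueue something after setting the exit flag" idea might look roughly like the sketch below. BlockingQueue is a hypothetical stand-in built on a mutex and condition variable purely to keep the example short; it is not lock-free, and run_pool_and_stop and the empty-Task wake-up convention are likewise assumptions made for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

using Task = std::function<void()>;

// Hypothetical stand-in for a queue that blocks when empty.
// Mutex-based for brevity; it is NOT lock-free.
class BlockingQueue {
public:
    void push(Task task) {
        {
            std::lock_guard<std::mutex> lock(mut_);
            tasks_.push_back(std::move(task));
        }
        cv_.notify_one();
    }
    Task pop() {  // blocks until an item is available
        std::unique_lock<std::mutex> lock(mut_);
        cv_.wait(lock, [this] { return !tasks_.empty(); });
        Task task = std::move(tasks_.front());
        tasks_.pop_front();
        return task;
    }
private:
    std::mutex mut_;
    std::condition_variable cv_;
    std::deque<Task> tasks_;
};

// Starts the workers, sets the exit flag, then enqueues one empty wake-up
// item per worker. Returns the number of workers that exited.
int run_pool_and_stop(int num_workers) {
    assert(num_workers > 0);
    BlockingQueue queue;
    std::atomic<bool> exit_now{false};
    std::atomic<int> exited{0};
    std::vector<std::thread> workers;
    for (int i = 0; i < num_workers; ++i) {
        workers.emplace_back([&] {
            for (;;) {
                Task task = queue.pop();
                if (task) task();  // an empty Task is only a wake-up
                // Check the flag after every item.
                if (exit_now.load(std::memory_order_relaxed)) break;
            }
            exited.fetch_add(1, std::memory_order_relaxed);
        });
    }
    exit_now.store(true);  // set the flag first ...
    for (int i = 0; i < num_workers; ++i) {
        queue.push(Task{});  // ... then wake every blocked worker
    }
    for (auto& t : workers) t.join();
    return exited.load();
}
```

Each worker consumes at most one wake-up item before exiting, so one item per worker is enough to unblock all of them. Real tasks still queued when the flag is set are dropped in this sketch.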

    Another option could be to cancel the other threads, e.g. send POSIX signals, and have a signal handler which exits. That could wake a thread up to get it to exit even if it's in the middle of blocking to wait for a condition variable (or C++20 std::atomic<T>::wait()).
    With C++20 there's std::jthread, which is cancellable (cooperatively, via std::stop_token); with C++17 and earlier you'd have to roll your own in a less-portable way.