Search code examples
c++atomicmemory-barriersstdatomicthread-sanitizer

Why does the thread sanitizer complain about acquire/release thread fences?


I'm learning about different memory orders.

I have this code, which works and passes GCC's and Clang's thread sanitizers:

#include <atomic>
#include <iostream>
#include <future>
    
int state = 0;
std::atomic_int a = 0;

void foo(int from, int to) 
{
    for (int i = 0; i < 10; i++)
    {
        while (a.load(std::memory_order_acquire) != from) {}
        state++;
        a.store(to, std::memory_order_release);
    }
}

int main()
{    
    auto x = std::async(std::launch::async, foo, 0, 1);
    auto y = std::async(std::launch::async, foo, 1, 0);
}

I reckon that an 'acquire' load is unnecessary if it doesn't end up returning from, so I decided to use a 'relaxed' load, followed by an 'acquire' fence.

I expected it to work, but it's rejected by the thread sanitizers, which claim that concurrent state++s are a data race.

#include <atomic>
#include <iostream>
#include <future>
    
int state = 0;
std::atomic_int a = 0;

void foo(int from, int to) 
{
    for (int i = 0; i < 10; i++)
    {
        while (a.load(std::memory_order_relaxed) != from) {}
        std::atomic_thread_fence(std::memory_order_acquire);
        state++;
        a.store(to, std::memory_order_release);
    }
}

int main()
{    
    auto x = std::async(std::launch::async, foo, 0, 1);
    auto y = std::async(std::launch::async, foo, 1, 0);
}

Why is this a data race?

Cppreference says that

Atomic-fence synchronization

An atomic release operation X in thread A synchronizes-with an acquire fence F in thread B, if

  • there exists an atomic read Y (with any memory order)
  • Y reads the value written by X (or by the release sequence headed by X)
  • Y is sequenced-before F in thread B

In this case, all non-atomic and relaxed atomic stores that are sequenced-before X in thread A will happen-before all non-atomic and relaxed atomic loads from the same locations made in thread B after F.

In my understanding, all conditions are met:

  • "there exists an atomic read Y (with any memory order)" — check: a.load(std::memory_order_relaxed).
  • "Y reads the value written by X" — check, it reads the value from a.store(to, std::memory_order_release);.
  • "Y is sequenced-before F in thread B" — check.

Solution

  • The thread sanitizer currently doesn't support std::atomic_thread_fence. (GCC and Clang use the same thread sanitizer, so it applies to both.)

    GCC 12 (currently trunk) warns about it:

    atomic_base.h:133:26: warning: 'atomic_thread_fence' is not supported with '-fsanitize=thread' [-Wtsan]
      133 |   { __atomic_thread_fence(int(__m)); }
          |     ~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~
    

    To stop the sanitizer from disregarding fences, you can manually instrument them, using __tsan_acquire and __tsan_release.

    #include <sanitizer/tsan_interface.h>

    while (a.load(std::memory_order_relaxed) != from) {}
    __tsan_acquire(&a); // <--
    std::atomic_thread_fence(std::memory_order_acquire);
    

    I assume it's tricky to automatically determine which atomic variables are affected by the fence.

    Even though the bug report says seq_cst fences are not affected, the code is still rejected if I use such a fence, they still need to be annotated with __tsan_acquire+__tsan_release, exactly the same as acq-rel fences.