Why standard doesn't require std::mutex::~mutex synchronizes-with with the latest unlock

struct X
{
    std::mutex m;
    std::string str;
    
    void set(std::string s) 
    {
        auto _ = std::unique_lock(m);
        str = std::move(s);
    }
    
    ~X()
    {
        // auto _ = std::unique_lock(m);
    }
}

Is there any part of the standard that guarantees that ~X will never experience race conditions inside ~string without the commented line?

The exclusive access to the object and/or lifetime could be managed by an atomic variable with RELAXED semantics (== no other data except the fact of the end of lifetime is synched). We have a mutex to protect the access to the object, so using relaxed operation seems to be ok to synchronize exclusive access/lifetime and mutex to access the data.

If the standard would require ~mutex to synchronize with the latest unlock and if we moved the declaration of the mutex below the protected data then we could afford default ~X but without this requirement, we need always have explicit destructor which locks all the mutexes used to protect any member.

PS to help understand my question use this example https://en.cppreference.com/w/cpp/thread/mutex there is a comment that accessing g_pages without a lock is safe. Why, which part of the standard guarantees this? The fact that both threads are joined only guarantees no multi-threaded access to the map, but it doesn't guarantee synchronize-with relationship with the latest mutex::unlock operation. I run many different programs trying to expose race conditions and I am pretty sure that mutex uses std::atomic_thread_fence for both lock and unlock which synchronizes with thread.join that must use at least some atomic to work. But the problem is standard doesn't require mutex to use std::atomic_thread_fence and it it just happens that all known to me implementations use it.

#include <chrono>
#include <iostream>
#include <map>
#include <mutex>
#include <string>
#include <thread>
 
std::map<std::string, std::string> g_pages;
std::mutex g_pages_mutex;
 
void save_page(const std::string &url)
{
    // simulate a long page fetch
    std::this_thread::sleep_for(std::chrono::seconds(2));
    std::string result = "fake content";
 
    std::lock_guard<std::mutex> guard(g_pages_mutex);
    g_pages[url] = result;
}
 
int main() 
{
    std::thread t1(save_page, "http://foo");
    std::thread t2(save_page, "http://bar");
    t1.join();
    t2.join();
 
    // safe to access g_pages without lock now, as the threads are joined
    for (const auto &pair : g_pages)
        std::cout << pair.first << " => " << pair.second << '\n';
}

PPS as was pointed out by @user17732522 the example above is guaranteed to be safe because of thread.join. You can remove mutex and one of the threads (just use 1 thread) and it will be safe. So I modified this example slightly to demonstrate the problem. Note that relaxed memory order is important here. If we replace it with a pair of acquire/release it will be perfectly thread-safe code, but it is also correct with relaxed if we assume std::mutex::unlock uses std::atomic_thread_fence

#include <atomic>
#include <chrono>
#include <iostream>
#include <map>
#include <mutex>
#include <string>
#include <thread>

std::map<std::string, std::string> g_pages;
std::mutex g_pages_mutex;
std::atomic_flag g_f1 = {};
std::atomic_flag g_f2 = {};

void save_page(const std::string& url, std::atomic_flag* flag)
{
    // simulate a long page fetch
    std::this_thread::sleep_for(std::chrono::seconds(2));
    std::string result = "fake content";

    {
        std::lock_guard<std::mutex> guard(g_pages_mutex);
        g_pages[url] = result;
    }
    flag->clear(std::memory_order_relaxed);
}

int main()
{
    g_f1.test_and_set();
    g_f2.test_and_set();

    std::thread t1(save_page, "http://foo", &g_f1);
    std::thread t2(save_page, "http://bar", &g_f2);

    while (g_f1.test_and_set(std::memory_order_relaxed));
    while (g_f2.test_and_set(std::memory_order_relaxed));

    // whether it safe to access g_pages without lock now, as the threads signaled they are done?
    for (const auto& pair : g_pages)
        std::cout << pair.first << " => " << pair.second << '\n';

    t1.join();
    t2.join();
}

Solution

Sort of. The standard does require you not to destroy an object who's lifetime has ended. If you run into the situation where multiple threads are trying to run X::~X() on the same object, you are already in UB land.

A similar argument can be made for one thread trying to ~X some x and another trying to x.set on the same object. If you ever run into this situation you are already violating object access rules for x. The UB stemming from races on x.str are a consequence of this.

So, you should never find yourself in the situation that you need a lock in the destructor of your owning X to begin with. If several threads are accessing the same x object, have it maybe be owned outside the thread pool.