Two Different Processes With 2 std::atomic Variables on Same Address?

I read § 32.6.1, paragraph 3 of the C++ standard draft (N4713):

Operations that are lock-free should also be address-free. That is, atomic operations on the same memory location via two different addresses will communicate atomically. The implementation should not depend on any per-process state. This restriction enables communication by memory that is mapped into a process more than once and by memory that is shared between two processes.

So it sounds like lock-free atomic operations on the same memory location are possible even from two different processes. I wonder how this can be done.

Let's say I have a named shared memory segment on Linux (via shm_open() and mmap()). How can I perform a lock-free operation on, for example, the first 4 bytes of the shared memory segment?

At first, I thought I could just reinterpret_cast the pointer to std::atomic<int32_t>*. But then I read this. It first points out that std::atomic<T> might not have the same size or alignment as T:

When we designed the C++11 atomics, I was under the misimpression that it would be possible to semi-portably apply atomic operations to data not declared to be atomic, using code such as

int x; reinterpret_cast<atomic<int>&>(x).fetch_add(1);

This would clearly fail if the representations of atomic and int differ, or if their alignments differ. But I know that this is not an issue on platforms I care about. And, in practice, I can easily test for a problem by checking at compile time that sizes and alignments match.

That part is fine with me in this case, because I use shared memory on the same machine, and casting the pointer in two different processes will "acquire" the same location. However, the article goes on to say that the compiler might not treat the cast pointer as a pointer to an atomic type:

However this is not guaranteed to be reliable, even on platforms on which one might expect it to work, since it may confuse type-based alias analysis in the compiler. A compiler may assume that an int is not also accessed as an atomic<int>. (See 3.10, [Basic.lval], last paragraph.)

Any input is welcome!

Solution

The C++ standard doesn't concern itself with multiple processes, so it gives no guarantees outside of a multi-threaded environment. However, the standard does recommend that lock-free atomics be usable across processes, and most real implementations follow that recommendation. This answer assumes that atomics behave more or less the same between processes as they do between threads.

The first solution requires C++20's std::atomic_ref:

void* shared_mem = /* something */;

auto p1 = new (shared_mem) int{0};        // Creating process: construct the shared int
auto p2 = static_cast<int*>(shared_mem);  // Other processes: get the existing shared int

std::atomic_ref<int> i{*p2};              // Binds to the object, not the pointer; use i as if atomic<int>

You need to make sure the shared int has std::atomic_ref<int>::required_alignment alignment; typically the same as sizeof(int). Normally you'd use alignas() on a struct member or variable, but in shared memory the layout is up to you (relative to a known page boundary).
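Since the layout in shared memory is up to you, it can help to check offsets and addresses explicitly. The following is a minimal sketch, not a definitive implementation: is_aligned_for and align_up are hypothetical helper names, and both assume the alignment is a power of two.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true if address p is a multiple of `alignment`.
// Assumes `alignment` is a non-zero power of two.
inline bool is_aligned_for(const void* p, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}

// Hypothetical helper: round a byte offset within the segment up to the
// next multiple of `alignment` (again a non-zero power of two).
inline std::size_t align_up(std::size_t offset, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (offset + alignment - 1) & ~(alignment - 1);
}
```

With C++20 you would pass std::atomic_ref<int>::required_alignment as the alignment. A pointer returned by mmap() is page-aligned, so an int at offset 0 of the mapping should pass this check on typical platforms.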

This keeps opaque atomic types out of the shared memory, which gives you precise control over exactly what goes in there.

A solution prior to C++20 would be:

auto p1 = new (shared_mem) std::atomic<int>;          // Creating process: construct the shared object
auto p2 = static_cast<std::atomic<int>*>(shared_mem); // Other processes: get the existing shared object

auto& i = *p2;
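The create/attach split above could be wrapped as follows. This is only a sketch: create_shared_counter and attach_shared_counter are hypothetical names, and the static_assert reflects the assumption that cross-process use is only sensible when std::atomic<int> is lock-free (and therefore, per the standard's recommendation, address-free).

```cpp
#include <atomic>
#include <cassert>
#include <new>

// Assumption: only a lock-free atomic is expected to work across processes.
static_assert(ATOMIC_INT_LOCK_FREE == 2,
              "std::atomic<int> is not always lock-free on this platform");

// Hypothetical helper for the process that creates the segment:
// constructs the atomic in place with an initial value.
inline std::atomic<int>* create_shared_counter(void* shared_mem, int initial) {
    assert(shared_mem != nullptr);
    return new (shared_mem) std::atomic<int>(initial);
}

// Hypothetical helper for every other process: reuses the object that the
// creator already constructed, without constructing it again.
inline std::atomic<int>* attach_shared_counter(void* shared_mem) {
    assert(shared_mem != nullptr);
    return static_cast<std::atomic<int>*>(shared_mem);
}
```

Exactly one process should call create_shared_counter, and the others need some way to know that construction has finished before they attach.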

Or, in C11, using atomic_load and atomic_store:

_Atomic int* i = (_Atomic int*)shared_mem;
atomic_store(i, 42);
int i2 = atomic_load(i);

The same alignment considerations apply here: the address must satisfy alignof(std::atomic<int>) or _Alignof(atomic_int).
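To see the cross-process behaviour in action, here is a rough sketch for Linux. Note that it swaps in an anonymous MAP_SHARED mapping plus fork() instead of shm_open(), so the example needs no segment name; the child therefore sees the memory at the same virtual address, which is a simpler case than two unrelated processes mapping a named segment. The function name count_from_two_processes is hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <new>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

static_assert(ATOMIC_INT_LOCK_FREE == 2,
              "std::atomic<int> is not always lock-free on this platform");

// Parent and child each increment a shared atomic counter.
// Returns the final value, or -1 if a system call fails.
inline int count_from_two_processes(int increments_per_process) {
    assert(increments_per_process >= 0);
    const std::size_t size = sizeof(std::atomic<int>);

    // Anonymous mapping shared with children created by fork().
    void* mem = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return -1;

    auto* counter = new (mem) std::atomic<int>(0);

    pid_t child = fork();
    if (child < 0) {
        munmap(mem, size);
        return -1;
    }
    if (child == 0) {
        // Child process: increment, then exit without running parent cleanup.
        for (int n = 0; n < increments_per_process; ++n)
            counter->fetch_add(1, std::memory_order_relaxed);
        _exit(0);
    }

    // Parent process: increment concurrently with the child.
    for (int n = 0; n < increments_per_process; ++n)
        counter->fetch_add(1, std::memory_order_relaxed);

    int status = 0;
    const bool waited = (waitpid(child, &status, 0) == child);
    const int result = waited ? counter->load() : -1;
    munmap(mem, size);
    return result;
}
```

If the increments are atomic across processes, count_from_two_processes(n) should return 2 * n; with a plain int in place of the atomic, some increments would likely be lost.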