c++ c multithreading shared-memory atomic

How do atomic variables based on shared memory work in inter-process contexts?

Let's say a process creates a piece of shared memory the size of two integers (64-bit / 8 bytes).

The shared memory will be available not only to threads of that process, but also to other processes on the system that have access to that piece of shared memory.

Presumably the shared memory in the first process will be addressed through that process's virtual address space, so when an atomic operation (compare-exchange) is performed on the first integer, the virtual address in the context of the first process is used.

If another process at the same time is performing some kind of atomic operation on the first integer, it would also be using its own virtual address space.

So what system actually performs the translation into the actual physical address, and from a very general POV, how does the CPU provide atomicity guarantees in this situation?

Solution

Each process's virtual addresses are translated to physical addresses by the CPU's memory management unit (MMU), using that process's page tables. Modern CPU caches operate on physical addresses (L1 caches are usually virtually indexed and physically tagged). Basically this means that two virtual addresses in two different processes that translate to the same physical address refer to the same cache line, which is cached just once per CPU.
Modern CPU caches are coherent: the caches of all CPUs in the system are kept synchronized, so no two CPUs can hold conflicting copies of the same cache line. Intel CPUs typically use a variant of the MESI protocol for this.
Modern CPUs have write (store) buffers, so a memory store takes some time to reach the cache.
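
To illustrate the first point, here is a minimal sketch, assuming a POSIX system, that maps the same page at two different virtual addresses and checks that a compare-exchange done through one address is visible through the other. It uses a single process with two mappings only to keep the example short; the function name and the temporary file path are made up for illustration. Treating the second mapping as a `std::atomic` object is not something the C++ standard describes; it relies on the platform's lock-free atomics being plain memory operations.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <new>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

// Only lock-free atomics are usable through different mappings, because a
// lock-based fallback would keep its lock outside the shared memory.
static_assert(std::atomic<std::int64_t>::is_always_lock_free,
              "64-bit atomics must be lock-free on this platform");

// Hypothetical helper: writes through one mapping, reads through the other.
// Returns the value seen through the second mapping, or -1 on setup failure.
std::int64_t write_via_one_mapping_read_via_other() {
    char path[] = "/tmp/atomic_demo_XXXXXX";   // illustrative temp file name
    int fd = mkstemp(path);
    if (fd < 0) return -1;
    unlink(path);                               // file lives on until fd and mappings go away

    const std::size_t size = 2 * sizeof(std::int64_t);
    if (ftruncate(fd, static_cast<off_t>(size)) != 0) { close(fd); return -1; }

    // Two independent mappings of the same file page: two virtual addresses,
    // one underlying physical page.
    void* a = mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    void* b = mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);
    if (a == MAP_FAILED || b == MAP_FAILED) {
        if (a != MAP_FAILED) munmap(a, size);
        if (b != MAP_FAILED) munmap(b, size);
        return -1;
    }
    assert(a != b);                             // distinct virtual addresses

    auto* first  = new (a) std::atomic<std::int64_t>(0);
    auto* second = static_cast<std::atomic<std::int64_t>*>(b);

    std::int64_t expected = 0;
    first->compare_exchange_strong(expected, 42);   // atomic write via address a
    const std::int64_t seen = second->load();       // read via address b

    munmap(a, size);
    munmap(b, size);
    return seen;
}
```

Under these assumptions the function returns 42: both virtual addresses resolve to the same physical memory, so there is only one value to observe.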

So, from a very general point of view, an atomic operation on a modern CPU reads a cache line and locks it for the exclusive use of that CPU until the operation is done, and it writes the result directly to the cache instead of leaving it in the CPU's write buffer. Because the lock applies to the physical cache line, it does not matter which virtual address each process used to reach it.
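
As a rough sketch of what this looks like from user code, the following assumes a POSIX system with `fork` and anonymous shared mappings (`MAP_ANONYMOUS`, available on Linux and most BSDs). Several child processes increment one shared 64-bit counter with a compare-exchange loop. The names `cas_increment` and `count_with_processes` are hypothetical. Note that forked children inherit the same virtual address for the mapping, but each has its own address space and page tables, so the hardware still sees independent translations to one physical cache line.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <new>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// A lock-based std::atomic would not be safe across processes.
static_assert(std::atomic<std::int64_t>::is_always_lock_free,
              "64-bit atomics must be lock-free on this platform");

// Increment using a compare-exchange retry loop (the "cmp exchange" from the question).
void cas_increment(std::atomic<std::int64_t>& value) {
    std::int64_t expected = value.load(std::memory_order_relaxed);
    // On failure, compare_exchange_weak reloads `expected` with the current value.
    while (!value.compare_exchange_weak(expected, expected + 1)) {
    }
}

// Forks `processes` children that each perform `increments` atomic increments.
// Returns the final counter value, or -1 if setup or a fork failed.
std::int64_t count_with_processes(int processes, int increments) {
    assert(processes > 0 && increments >= 0);

    void* mem = mmap(nullptr, sizeof(std::atomic<std::int64_t>),
                     PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return -1;
    auto* counter = new (mem) std::atomic<std::int64_t>(0);

    int started = 0;
    for (int p = 0; p < processes; ++p) {
        const pid_t pid = fork();
        if (pid == 0) {                      // child: hammer the shared counter
            for (int i = 0; i < increments; ++i) cas_increment(*counter);
            _exit(0);                        // skip atexit handlers and stdio flushing
        }
        if (pid > 0) ++started;
    }

    for (int i = 0; i < started; ++i) wait(nullptr);   // reap every child

    const std::int64_t total = counter->load();
    munmap(mem, sizeof(std::atomic<std::int64_t>));
    return started == processes ? total : -1;
}
```

If every fork succeeds, `count_with_processes(4, 10000)` returns 40000. With a plain non-atomic `counter = counter + 1` in place of the compare-exchange loop, increments from different processes could overwrite each other and the total would typically come out lower.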