If a data structure contains multiple elements, its atomic version cannot always be lock-free. I was told this is the case for larger types because the CPU cannot change the data atomically without using some sort of lock.
For example:
#include <iostream>
#include <atomic>
struct foo {
    double a;
    double b;
};

std::atomic<foo> var;

int main()
{
    std::cout << var.is_lock_free() << std::endl;
    std::cout << sizeof(foo) << std::endl;
    std::cout << sizeof(var) << std::endl;
}
The output (Linux/gcc) is:
0
16
16
Since the atomic and foo are the same size, I don't think a lock is stored in the atomic.
My question is:
If an atomic variable uses a lock, where is it stored, and what does that mean for multiple instances of that variable?
The easiest way to answer such questions is generally to just look at the resulting assembly and take it from there.
Compiling the following (I made your struct larger to dodge crafty compiler shenanigans):
#include <atomic>
struct foo {
    double a;
    double b;
    double c;
    double d;
    double e;
};

std::atomic<foo> var;

void bar()
{
    var.store(foo{1.0, 2.0, 1.0, 2.0, 1.0});
}
with clang 5.0.0 under -O3 yields the following (see it on godbolt):
bar():                                  # @bar()
        sub     rsp, 40
        movaps  xmm0, xmmword ptr [rip + .LCPI0_0] # xmm0 = [1.000000e+00,2.000000e+00]
        movaps  xmmword ptr [rsp], xmm0
        movaps  xmmword ptr [rsp + 16], xmm0
        movabs  rax, 4607182418800017408
        mov     qword ptr [rsp + 32], rax
        mov     rdx, rsp
        mov     edi, 40
        mov     esi, var
        mov     ecx, 5
        call    __atomic_store
Great, the compiler delegates to an intrinsic (__atomic_store), which doesn't tell us what's really going on here. However, since the compiler is open source, we can easily look up the implementation of that intrinsic (I found it in https://github.com/llvm-mirror/compiler-rt/blob/master/lib/builtins/atomic.c):
void __atomic_store_c(int size, void *dest, void *src, int model) {
#define LOCK_FREE_ACTION(type) \
    __c11_atomic_store((_Atomic(type)*)dest, *(type*)src, model);\
    return;
  LOCK_FREE_CASES();
#undef LOCK_FREE_ACTION
  Lock *l = lock_for_pointer(dest);
  lock(l);
  memcpy(dest, src, size);
  unlock(l);
}
It seems like the magic happens in lock_for_pointer(), so let's have a look at it:
static __inline Lock *lock_for_pointer(void *ptr) {
  intptr_t hash = (intptr_t)ptr;
  // Disregard the lowest 4 bits. We want all values that may be part of the
  // same memory operation to hash to the same value and therefore use the same
  // lock.
  hash >>= 4;
  // Use the next bits as the basis for the hash
  intptr_t low = hash & SPINLOCK_MASK;
  // Now use the high(er) set of bits to perturb the hash, so that we don't
  // get collisions from atomic fields in a single object
  hash >>= 16;
  hash ^= low;
  // Return a pointer to the word to use
  return locks + (hash & SPINLOCK_MASK);
}
And here's our explanation: the address of the atomic variable is used to compute a hash key, which selects one of a set of pre-allocated locks.
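To make the consequences concrete, here is a minimal sketch of the same scheme in plain C++. This is my own illustration, not the actual compiler-rt or libatomic code; the names SPINLOCK_COUNT, lock_table, lock_index_for, and locked_store are made up for this example. The idea it demonstrates: every over-sized atomic is mapped onto a fixed, global table of spinlocks by hashing its address, so the lock lives outside the object itself.

#include <atomic>
#include <cstdint>
#include <cstring>
#include <iostream>

// Illustrative sketch only: a fixed, global table of spinlocks shared by
// all "locked" atomics in the program (names and sizes are made up).
constexpr std::size_t SPINLOCK_COUNT = 1 << 10;
static std::atomic<bool> lock_table[SPINLOCK_COUNT] = {}; // all unlocked initially

// Hash an object's address to pick a lock slot, in the spirit of
// lock_for_pointer() above.
static std::size_t lock_index_for(const void* ptr) {
    std::uintptr_t hash = reinterpret_cast<std::uintptr_t>(ptr);
    hash >>= 4;                                    // ignore bits within one object
    std::size_t low = hash & (SPINLOCK_COUNT - 1); // low part of the hash
    hash >>= 16;                                   // perturb with higher bits
    return (hash ^ low) & (SPINLOCK_COUNT - 1);
}

// A non-lock-free "atomic" store: acquire the per-address spinlock,
// copy the bytes, release the lock.
static void locked_store(void* dest, const void* src, std::size_t size) {
    std::atomic<bool>& l = lock_table[lock_index_for(dest)];
    while (l.exchange(true, std::memory_order_acquire)) { /* spin */ }
    std::memcpy(dest, src, size);
    l.store(false, std::memory_order_release);
}

struct foo { double a, b, c, d, e; };

int main() {
    foo x{}, y{};
    foo value{1.0, 2.0, 1.0, 2.0, 1.0};
    locked_store(&x, &value, sizeof x);
    locked_store(&y, &value, sizeof y);
    // Two distinct variables may or may not land on the same lock slot:
    std::cout << lock_index_for(&x) << ' ' << lock_index_for(&y) << '\n';
}

Two consequences follow directly from this design: the lock is stored in a global table inside the runtime support library (compiler-rt here) rather than in the std::atomic object itself, which is why sizeof(var) == sizeof(foo) in your test, and because the table is finite, unrelated atomic variables whose addresses hash to the same slot will contend on the same lock.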