What design purposes or technical restrictions make the return value of std::fetch_add the value from before the addition?
It's not a big deal either way: you can emulate one in terms of the other, e.g. a hypothetical val.add_fetch(1) can be implemented as 1 + val.fetch_add(1) if you want it. (GNU C __atomic builtins provide both.)
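For example, a trivial wrapper along those lines (add_fetch is a made-up name here, not part of std::atomic):

#include <atomic>

// Hypothetical add_fetch: returns the *new* value, built on fetch_add,
// which returns the old value.
unsigned add_fetch(std::atomic<unsigned> &val, unsigned x) {
    return val.fetch_add(x) + x;   // old value + x == new value
}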
Possible reason for ISO C/C++ only providing fetch_add instead of add_fetch: it makes it cheaper to implement on x86 in some cases; lock xadd [mem], reg leaves reg = old value of mem, mem = sum. Providing that primitive instead of the other encourages people to design algorithms around that building block, maybe avoiding the need for an extra add instruction.
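As a concrete illustration (typical x86-64 codegen from current compilers, not anything the standard guarantees): returning the old value can compile to just the lock xadd, while returning the new value generally needs an extra add to reconstruct it.

#include <atomic>

unsigned old_val(std::atomic<unsigned> &v) {
    return v.fetch_add(1);        // typically: mov eax,1 / lock xadd [mem],eax
}

unsigned new_val(std::atomic<unsigned> &v) {
    return v.fetch_add(1) + 1;    // same, plus an extra add eax,1 to get the new value
}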
Most RISC ISAs with LL/SC atomics have 3-operand instructions, so they could add dst, src1, src2 and leave the value from memory undisturbed in another register if code wants it later. (LL/SC fetch_add(x) would normally be implemented as load-linked reg1, [mem] / add reg2, reg1, x / store-conditional reg3, reg2, [mem], with a retry loop based on the success/fail result in reg3. If the fetch_add return value is unused, the add can overwrite reg1 instead of using a new reg.)
So on most RISCs it's fine either way, and x86 is one of the more-relevant ISAs to care about efficiency on.
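In C++ terms, that LL/SC retry loop has roughly the shape of a compare_exchange_weak loop (a sketch to show the shape, not how real implementations on LL/SC machines are actually written): load the old value, compute old + x, retry if the conditional store/CAS fails, and the old value is still sitting in a register either way.

#include <atomic>

// Roughly the shape an LL/SC fetch_add lowers to, written with CAS:
unsigned fetch_add_loop(std::atomic<unsigned> &v, unsigned x) {
    unsigned old = v.load(std::memory_order_relaxed);
    while (!v.compare_exchange_weak(old, old + x)) {
        // on failure, compare_exchange_weak reloads `old`, like re-running the LL
    }
    return old;   // value from before the add, like fetch_add
}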
For some use-cases, fetch_add returning the old value is also exactly what you want, e.g. threads grabbing buckets in an array-based circular-buffer lock-free queue: with std::atomic<unsigned> write_idx; zero-initialized to start with, you want the first claimed index to be 0.
static std::atomic<unsigned> write_idx = 0; // shared var
// in each thread: (size = log2 of the power-of-2 buffer capacity)
unsigned my_buf = write_idx.fetch_add(1) & ((1<<size) - 1);
You'll get values starting with 0 instead of 1. This seems like a plausible pattern for lots of use-cases.
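A slightly more complete sketch of that slot-claiming part (just the index math with a couple of threads; a real queue needs more machinery to publish and consume the data safely, and the names buf_log2 / claim_slot are mine, not from any library):

#include <atomic>
#include <cstdio>
#include <thread>

static constexpr unsigned buf_log2 = 4;        // 16-entry circular buffer
static std::atomic<unsigned> write_idx = 0;    // shared var, starts at 0

unsigned claim_slot() {
    // fetch_add returns the pre-increment value, so the first caller gets slot 0
    return write_idx.fetch_add(1) & ((1u << buf_log2) - 1);
}

int main() {
    auto worker = [] {
        for (int i = 0; i < 4; ++i)
            std::printf("claimed slot %u\n", claim_slot());
    };
    std::thread t1(worker), t2(worker);
    t1.join(); t2.join();
}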