c++, lock-free

Hardware support for atomic fetch-and-add vs. fetch-and-or


It seems that fetch_add beats a CAS loop on CPUs that support both (see the post's comments as well).

When changing clear bit(s) to set bit(s), you can use either a bitwise OR or an addition. As long as the bits are known to be clear beforehand, no carry occurs and the results are identical. I expect the two operations to perform equally, so the choice would hinge on differences in hardware support for them, if any. I failed to turn up any information on relative processor support.
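To make the equivalence concrete, here is a minimal sketch using std::atomic. The flag value kFlag and the helper names are hypothetical, chosen only for illustration, and relaxed ordering is an assumption rather than a recommendation. It shows that both operations agree when the bit starts out clear, and that they diverge if it does not.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical flag bit, chosen only for illustration.
constexpr std::uint32_t kFlag = 1u << 3;

// Sets kFlag with a bitwise OR and returns the previous value.
// Safe whether or not the bit is already set.
inline std::uint32_t set_with_or(std::atomic<std::uint32_t>& word) {
    return word.fetch_or(kFlag, std::memory_order_relaxed);
}

// Sets kFlag with an addition and returns the previous value.
// Only equivalent to the OR if kFlag is clear beforehand; otherwise
// the carry spills into the next higher bit.
inline std::uint32_t set_with_add(std::atomic<std::uint32_t>& word) {
    return word.fetch_add(kFlag, std::memory_order_relaxed);
}

// Small self-check: same result while the bit is clear,
// different result once it is already set.
inline void demo_or_vs_add() {
    std::atomic<std::uint32_t> a{0x1}, b{0x1};
    set_with_or(a);
    set_with_add(b);
    assert(a.load() == b.load());  // bit was clear: identical

    set_with_or(a);
    set_with_add(b);
    assert(a.load() != b.load());  // bit was set: add carried
}
```

The second half of demo_or_vs_add is the case to watch for: the OR is idempotent, while the addition is not.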

Is there any reason to prefer one over the other in this case?


Solution

  • Instead of coding for a specific processor architecture, you might want to use a compiler intrinsic. GCC and Clang, for example, support several atomic builtins, one of which is __sync_fetch_and_or.

    Since Visual Studio 2005, Visual C++ has supported _InterlockedOr on all architectures.
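As a rough sketch of how the two intrinsics above could sit behind one function, assuming a hypothetical wrapper name portable_fetch_or and a long operand (the type _InterlockedOr works on):

```cpp
#include <cassert>
#if defined(_MSC_VER)
#include <intrin.h>
#endif

// Hypothetical wrapper: atomically ORs `mask` into `*word` and returns
// the value that `*word` held before the operation.
inline long portable_fetch_or(volatile long* word, long mask) {
#if defined(_MSC_VER)
    return _InterlockedOr(word, mask);       // Visual C++ intrinsic
#else
    return __sync_fetch_and_or(word, mask);  // GCC / Clang builtin
#endif
}

// Small self-check of the wrapper's return value and effect.
inline void demo_portable_fetch_or() {
    volatile long flags = 0x1;
    long previous = portable_fetch_or(&flags, 0x4);
    assert(previous == 0x1);  // returns the old value
    assert(flags == 0x5);     // bit 2 is now set
}
```

With a C++11 or later compiler, std::atomic<T>::fetch_or offers the same operation without any preprocessor branching, and the compiler picks the instruction sequence for the target.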