I'm using the TBB compare_and_swap operation on Xeon Phi in a lock-free algorithm. Since Xeon Phi is an in-order machine, it doesn't support the sfence instruction. So will the atomic operations work correctly on Xeon Phi?
Yes, they work correctly; most of TBB itself is built on atomic operations. sfence is not required for atomic operations to work correctly: it is a standalone memory barrier, whereas atomic operations imply memory barriers themselves. TBB doesn't use sfence even on regular Xeons; it uses mfence for a full memory fence instead. On Xeon Phi, mfence is replaced by a no-op atomic operation (a locked add of zero to the top of the stack). For example, mic_common.h in TBB contains the following definitions:
/** Intel(R) Many Integrated Core Architecture does not support mfence and pause instructions **/
#define __TBB_full_memory_fence() __asm__ __volatile__("lock; addl $0,(%%rsp)":::"memory")
#define __TBB_Pause(x) _mm_delay_32(16*(x))
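
To illustrate why no separate fence is needed around a compare-and-swap, here is a minimal sketch of a lock-free counter built on a CAS retry loop. It uses std::atomic as a stand-in for tbb::atomic so that it compiles without TBB; cas_add and run_counter are hypothetical names invented for this example, and compare_exchange_weak plays the role that compare_and_swap plays in TBB (note that the TBB call returns the previous value rather than a bool).

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical helper: atomically add `delta` to `target` with a CAS retry loop.
// Assumption: std::atomic stands in for tbb::atomic here.
long cas_add(std::atomic<long>& target, long delta) {
    long observed = target.load(std::memory_order_relaxed);
    // On failure, compare_exchange_weak refreshes `observed` with the current
    // value, so the loop simply retries with up-to-date data.
    while (!target.compare_exchange_weak(observed, observed + delta)) {
    }
    return observed + delta;  // value this call installed
}

// Hypothetical driver: `threads` workers each perform `iterations` increments.
long run_counter(int threads, int iterations) {
    std::atomic<long> counter{0};
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&counter, iterations] {
            for (int i = 0; i < iterations; ++i) {
                cas_add(counter, 1);
            }
        });
    }
    for (auto& worker : pool) {
        worker.join();
    }
    return counter.load();
}
```

The successful exchange uses the default sequentially consistent ordering, so the code contains no explicit sfence or mfence; on x86-family targets a compare-and-swap of this kind is normally compiled to a lock-prefixed instruction, which is the same mechanism the __TBB_full_memory_fence definition above relies on.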