Search code examples
c++winapisynchronizationspinlockintel-inspector

Intel Inspector reports a data race in my spinlock implementation


I made a very simple spinlock using the Interlocked functions in Windows and tested it on a dual-core CPU (two threads that increment a variable);

The program seems to work OK (it gives the same result every time, which is not the case when no synchronization is used), but Intel Parallel Inspector says that there is a race condition at value += j (see the code below). The warning disappears when using Critical Sections instead of my SpinLock.

Is my implementation of SpinLock correct or not ? It's really strange, because all the used operations are atomic and have the proper memory barriers and it shouldn't lead to race conditions.

class SpinLock
{
   int *lockValue;
   SpinLock(int *value) : lockValue(value) { }

   void Lock() {
      while(InterlockedCompareExchange((volatile LONG*)lockValue, 1, 0) != 0) {
          WaitABit();
      }
   }

   void Unlock() { InterlockedExchange((volatile LONG*)lockValue, 0); }
};

The test program:

static const int THREADS = 2;
HANDLE completedEvents[THREADS];
int value = 0;
int lock = 0; // Global.

DWORD WINAPI TestThread(void *param) {
    HANDLE completed = (HANDLE)param;
    SpinLock testLock(&lock);

    for(int i = 0;i < 1000*20; i++) {
        for(int j = 0;j < 10*10; j++) {
            // Add something to the variable.
            testLock.Lock();
            value += j;
            testLock.Unlock();
        }
    }
    SetEvent(completed);
}

int main() {
   for(int i = 0; i < THREADS; i++) {
        completedEvents[i] = CreateEvent(NULL, true, false, NULL);
   }
   for(int i = 0; i < THREADS; i++) {
        DWORD id;
        CreateThread(NULL, 0, TestThread, completedEvents[i], 0, &id);
   }

   WaitForMultipleObjects(THREADS, completedEvents, true, INFINITE);
   cout<<value;
}

Solution

  • Parallel Inspector's documentation for data race suggests using a critical section or a mutex to fix races on Windows. There's nothing in it which suggests that Parallel Inspector knows how to recognise any other locking mechanism you might invent.

    Tools for analysis of novel locking mechanisms tend to be static tools which look at every possible path through the code, Parallel Inspector's documentation implies that it executes the code once.

    If you want to experiment with novel locking mechanisms, the most common tool I've seen used in academic literature is the Spin model checker. There's also ESP, which might reduce the state space, but I don't know if it's been applied to concurrent problems, and also the mobility workbench which would give an analysis if you can couch your problem in pi-calculus. Intel Parallel Inspector doesn't seem anything like as complicated as these tools, but rather designed to check for commonly occurring issues using heuristics.