Search code examples
cpux86-64atomicinstructionsmesi

If write operation happens during exclusive cache access why is there data race?


I was reading about MESI protocol and cannot understand why there is data race if we have exclusive access on every write operation which consequently invalidated cache lines in other cores' caches? in this example:

CYCLE # CORE 1                        CORE 2
0   reg = load(&counter);   
1   reg = reg + 1;                reg = load(&counter);
2   store(&counter, reg);         reg = reg + 1;
3                                 store(&counter, reg);

it's said that the overall result is that the variable is only incremented once while both cores try to increment it (result is expected to be two). So the question is if during writing operation both cores request exclusive access to the cache line (and thus other cores "wait" their turn to modify and thus get also exclusive access) why there is data race on that variable?


Solution

  • If I read the matter right, MESI is just a red herring here:

    0   reg = load(&counter);   
    

    counter has now been loaded into CPU's register.

    1   reg = reg + 1;                reg = load(&counter);
    

    First processor incremented the value, second one loaded the old one.

    2   store(&counter, reg);         reg = reg + 1;
    

    First processor stores value, second one increments its outdated one.

    3                                 store(&counter, reg);
    

    Second processor stores the result of calculation based on an outdated value.

    Should be clear so far. Now how will that change if adding MESI states:

    0   reg = load(&counter);   
    

    counter is in CPU 1 cache, marked E.

    1   reg = reg + 1;                reg = load(&counter);
    

    counter still resides in CPU 1 cache, but is loaded into CPU 2 cache, too. So both cache lines need to be marked as S

    2   store(&counter, reg);         reg = reg + 1;
    

    Now counter gets stored back in cache. CPU 1 cache thus needs to be marked as M, while CPU 2 cache gets invalidated (marked as I).

    3                                 store(&counter, reg);
    

    As CPU 2 cache got invalidated, it needs to be updated before the store operation can occur, which, in turn, requires CPU 1 cache to be written back to memory before (of course).

    But all that being done now, the value in reg still was calculated based on an outdated value and still overwrites the (now updated) value in cache...

    Adding a final detail: After the operation, CPU 2 cache will be marked M and CPU 1 cache I.