Search code examples
dcaslock-free

fail with lock-free fixedsize queue


I try to implement something like lock-free fixedsize queue in D language

import core.atomic;
struct Chunk(T, uint N)
{
    T[N] data;
    shared uint count_;
    shared uint queueCounter;

    @property bool full() { return count_ == N; }

    void append(T value)
    {
        atomicOp!("+=")(queueCounter, 1);
        while(1)
        {
            uint c = count_;
            if(cas(&count_, c, c + 1))
            {
                data[c] = value;
                atomicOp!("-=")(queueCounter, 1);
                break;
            }
        }       
    }

    bool wait()
    {
        if(!full())
        {
            return false;
        }

        while(0 != queueCounter) {}

        return true;
    }
}

Call it like:

import std.parallelism;

struct S
{
    bool dirty;
    int time;
    int[16] data;
}

int main(string[] argv)
{
    const uint N = 14343;

    Chunk!(S, N) ch;


    foreach(i; taskPool.parallel(std.range.iota(N), 10))
    {
        S item;
        item.time = i;
        ch.append(item);
    }
    while(!ch.wait()) {}

    // DONE

    return 0;
}

It works fine with N == 14343, but fails without any message with 14344 (value depends on S.sizeof).

Why is program fail?

Am I doing correct CAS append?

Is chunk.data fully accessible after the "DONE" string?


Solution

  • It looks like you're running it on Windows, where the default stack size is 1 MB (at least according to this article at MSDN).

    Your S.sizeof is probably 72, which gives no more than 14563 instances of S (there are also other things on the stack, hence your maximum N is slightly lower).

    Placing a larger variable on the stack causes a stack overflow, which should occur as soon as main is called: ch is then assigned the value of Chunk!(S, N).init which causes a write outside stack bounds, hitting a guard page and consequently crashing the program with a segmentation fault (at least this is the result on Linux when N is large enough to overflow the default 8-megabyte stack), or, in Windows terminology, an access violation (I don't have a Windows box right now to verify it).

    There are a few solutions:

    1. Use smaller N.
    2. Increase stack size (see the article linked above).
    3. Allocate data on the heap.