I try to implement something like lock-free fixedsize queue in D language
import core.atomic;
struct Chunk(T, uint N)
{
T[N] data;
shared uint count_;
shared uint queueCounter;
@property bool full() { return count_ == N; }
void append(T value)
{
atomicOp!("+=")(queueCounter, 1);
while(1)
{
uint c = count_;
if(cas(&count_, c, c + 1))
{
data[c] = value;
atomicOp!("-=")(queueCounter, 1);
break;
}
}
}
bool wait()
{
if(!full())
{
return false;
}
while(0 != queueCounter) {}
return true;
}
}
Call it like:
import std.parallelism;
struct S
{
bool dirty;
int time;
int[16] data;
}
int main(string[] argv)
{
const uint N = 14343;
Chunk!(S, N) ch;
foreach(i; taskPool.parallel(std.range.iota(N), 10))
{
S item;
item.time = i;
ch.append(item);
}
while(!ch.wait()) {}
// DONE
return 0;
}
It works fine with N == 14343
, but fails without any message with 14344 (value depends on S.sizeof
).
Why is program fail?
Am I doing correct CAS append?
Is chunk.data
fully accessible after the "DONE" string?
It looks like you're running it on Windows, where the default stack size is 1 MB (at least according to this article at MSDN).
Your S.sizeof
is probably 72, which gives no more than 14563 instances of S
(there are also other things on the stack, hence your maximum N
is slightly lower).
Placing a larger variable on the stack causes a stack overflow, which should occur as soon as main is called: ch
is then assigned the value of Chunk!(S, N).init
which causes a write outside stack bounds, hitting a guard page and consequently crashing the program with a segmentation fault (at least this is the result on Linux when N
is large enough to overflow the default 8-megabyte stack), or, in Windows terminology, an access violation (I don't have a Windows box right now to verify it).
There are a few solutions:
N
.data
on the heap.