The following code fails on the second assert statement at the 20th iteration. Note that I'm just recreating the code that caused the issue; the iteration count is not relevant, only the number of bytes written.
SingleChronicleQueue writer = SingleChronicleQueueBuilder.binary("/tmp/broken").build();
ExcerptAppender excerptAppender = writer.acquireAppender();
try (DocumentContext dc = excerptAppender.writingDocument())
{
    dc.wire().bytes().writeSkip(36);
}
for (int i = 0; i < 20; i++)
{
    try (DocumentContext dc = excerptAppender.writingDocument())
    {
        dc.wire().bytes().writeSkip(14);
    }
}

SingleChronicleQueue reader = SingleChronicleQueueBuilder.binary("/tmp/broken").build();
ExcerptTailer tailer = reader.createTailer();
try (DocumentContext dc = tailer.readingDocument())
{
    assert dc.isPresent() && dc.wire().bytes().readRemaining() == 36;
}
for (int i = 0; i < 20; i++)
{
    try (DocumentContext dc = tailer.readingDocument())
    {
        // Fails on the 20th read, with 16 bytes being returned
        assert dc.isPresent() && dc.wire().bytes().readRemaining() == 14;
    }
}
The issue appears to be in the SingleChronicleQueueExcerpts class, where padding is added to the message to align it to a 64-byte cache line. I wasn't anticipating having to add my own message lengths to my writes, but it doesn't seem avoidable if chronicle-queue is not padding its own header to the cache line boundary.
Thanks in advance.
The problem the padding is trying to work around is that CAS operations are not actually atomic across cache lines. On ARM you just get a SIGBUS, but on x64 it just works 99.999% of the time.
We discovered this after customers were already using the format, so we ended up with this workaround. The next major version should fix this. I suggest adding a stop-bit-encoded length to the start of each message, which will only take a byte or two.
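A minimal sketch of what such a length prefix could look like, assuming the usual stop-bit scheme (7 bits of the value per byte, with the high bit set while more bytes follow). The class and method names here are hypothetical and use a plain `ByteBuffer` so the example runs without Chronicle on the classpath. As far as I know, Chronicle Bytes exposes `writeStopBit` and `readStopBit`, which could be called on `dc.wire().bytes()` instead of this hand-rolled version.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not part of Chronicle Queue: prefixes each payload with
// a stop-bit-encoded length so a reader can ignore any trailing padding.
public class StopBitLengthPrefix {

    /** Writes a non-negative value 7 bits at a time; a set high bit means "more bytes follow". */
    public static void writeStopBit(ByteBuffer out, long value) {
        if (value < 0) {
            throw new IllegalArgumentException("this sketch handles non-negative values only");
        }
        while (value > 0x7F) {
            out.put((byte) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.put((byte) value); // final byte has the high bit clear
    }

    /** Reads a value written by writeStopBit. */
    public static long readStopBit(ByteBuffer in) {
        long value = 0;
        int shift = 0;
        byte b;
        while ((b = in.get()) < 0) { // negative byte == high bit set
            value |= (long) (b & 0x7F) << shift;
            shift += 7;
        }
        return value | ((long) b << shift);
    }

    /** Writes the payload length followed by the payload itself. */
    public static void writeMessage(ByteBuffer out, byte[] payload) {
        writeStopBit(out, payload.length);
        out.put(payload);
    }

    /** Reads exactly the number of payload bytes announced by the prefix. */
    public static byte[] readMessage(ByteBuffer in) {
        int length = (int) readStopBit(in);
        byte[] payload = new byte[length];
        in.get(payload);
        return payload;
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(64);
        writeMessage(buf, new byte[14]);
        buf.put(new byte[2]); // simulate two bytes of alignment padding
        buf.flip();
        System.out.println(buf.remaining());         // 17: prefix + payload + padding
        System.out.println(readMessage(buf).length); // 14: padding is ignored
    }
}
```

With the prefix in place, the reader would trust the decoded length rather than `readRemaining()`, so the two extra padding bytes seen on the 20th read no longer matter. A 14-byte payload costs one extra byte; lengths of 128 or more cost two.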