java memory-layout mappedbytebuffer project-panama

Java 19 Foreign Memory - Read a varint from a MemorySegment


I need to read from a large memory-mapped file. As we know, ByteBuffer suffers from several limitations, such as the 2 GB size limit and the inability to explicitly unmap a memory-mapped file. I have been investigating MemorySegment, which aims to solve these issues.

My file contains many variable-length integers (varints), which are easy to read with a ByteBuffer using a method like the following:

public static int getVarInt(ByteBuffer src) {
    int tmp;
    if ((tmp = src.get()) >= 0) {
        return tmp;
    }
    int result = tmp & 0x7f;
    if ((tmp = src.get()) >= 0) {
        result |= tmp << 7;
    } else {
        result |= (tmp & 0x7f) << 7;
        if ((tmp = src.get()) >= 0) {
            result |= tmp << 14;
        } else {
            result |= (tmp & 0x7f) << 14;
            if ((tmp = src.get()) >= 0) {
                result |= tmp << 21;
            } else {
                result |= (tmp & 0x7f) << 21;
                result |= (tmp = src.get()) << 28;
                while (tmp < 0) {
                    tmp = src.get();
                }
            }
        }
    }
    return result;
}

It's also possible to read an int or a long from any position of a ByteBuffer, regardless of alignment.
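For illustration, here is a minimal sketch of such an absolute read at an odd offset. The class and method names are made up for this example, and the little-endian byte order is an arbitrary choice.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class UnalignedByteBufferRead {

    // Writes an int at the given offset and reads it back using the
    // absolute accessors, which do not require 4-byte alignment.
    public static int roundTripInt(int offset, int value) {
        ByteBuffer buf = ByteBuffer.allocate(16).order(ByteOrder.LITTLE_ENDIAN);
        buf.putInt(offset, value);   // absolute put; the buffer position is untouched
        return buf.getInt(offset);   // absolute get at the same offset
    }

    // Same idea for a long (8 bytes) at an arbitrary offset.
    public static long roundTripLong(int offset, long value) {
        ByteBuffer buf = ByteBuffer.allocate(16).order(ByteOrder.LITTLE_ENDIAN);
        buf.putLong(offset, value);
        return buf.getLong(offset);
    }

    public static void main(String[] args) {
        System.out.println(roundTripInt(3, 10));      // offset 3 is not a multiple of 4
        System.out.println(roundTripLong(5, 1L << 40));
    }
}
```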

A MemoryLayout doesn't seem to help here, because it describes a struct of fixed size, whereas my elements vary in length.

Moreover, if I read an int that is not aligned to 4 bytes, MemorySegment throws an exception:

MemorySegment segment = MemorySegment.allocateNative(1024, MemorySession.global());

segment.set(ValueLayout.JAVA_INT, 0, 10);
// You can't read from position 3 even if you slice the memory segment :(
var elem = segment.asSlice(3,4).get(ValueLayout.JAVA_INT, 0);
java.lang.IllegalArgumentException: Misaligned access at address: 5066757123

Is there an efficient way to read a structure containing many variable-length integers, as well as integers that are not aligned to 4 bytes?


Solution

  • It's difficult to say why the memory alignment checks were put into the Foreign Function & Memory API in the first place. In their current form, they are confusing and more of an obstacle than a help.

    Fortunately, you can turn it off:

    var UNALIGNED_INT = ValueLayout.JAVA_INT.withBitAlignment(8);
    MemorySegment segment = MemorySegment.allocateNative(1024, MemorySession.global());
    
    var elem = segment.get(UNALIGNED_INT, 3);
    System.out.println(elem);
    

    Note that this only works if the underlying processor can access unaligned memory and is configured to do so. As far as I know, this is the case for Windows (x86-64), Linux (x86-64 and ARM64) and macOS (x86-64 and ARM64). It's also the case for many 32-bit systems, but they are not supported by the Foreign Function & Memory API.
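As for the varints themselves: they are decoded one byte at a time, and single-byte access has no alignment requirement, so the unaligned layout is only needed for the fixed-width values. Below is a minimal sketch of the question's getVarInt rewritten against absolute offsets. ByteSource and Cursor are hypothetical helpers introduced for this example, not part of any API; the cursor stands in for ByteBuffer's internal position, which a MemorySegment does not have. With a segment, the source could be a lambda such as `off -> segment.get(ValueLayout.JAVA_BYTE, off)`. The demo uses a plain byte array so that it runs without preview flags.

```java
public class SegmentVarInt {

    /** Hypothetical abstraction: returns the byte stored at an absolute offset. */
    @FunctionalInterface
    public interface ByteSource {
        byte get(long offset);
    }

    /** Hypothetical mutable read position, replacing ByteBuffer's internal one. */
    public static final class Cursor {
        public long offset;

        public Cursor(long offset) {
            this.offset = offset;
        }
    }

    /**
     * Reads one varint starting at cur.offset and advances the cursor past it.
     * Intended to mirror the ByteBuffer version: 7 payload bits per byte,
     * least significant group first, high bit set means "more bytes follow".
     */
    public static int getVarInt(ByteSource src, Cursor cur) {
        int result = 0;
        for (int shift = 0; shift < 35; shift += 7) {
            byte b = src.get(cur.offset++);
            result |= (b & 0x7f) << shift;
            if (b >= 0) {            // high bit clear: this was the last byte
                return result;
            }
        }
        // Over-long encoding: skip the remaining continuation bytes,
        // as the original method does, and return the low 32 bits.
        byte b;
        do {
            b = src.get(cur.offset++);
        } while (b < 0);
        return result;
    }

    public static void main(String[] args) {
        // 0xAC 0x02 encodes 300, 0x05 encodes 5
        byte[] data = {(byte) 0xAC, 0x02, 0x05};
        ByteSource src = off -> data[(int) off];
        Cursor cur = new Cursor(0);
        System.out.println(getVarInt(src, cur)); // 300
        System.out.println(getVarInt(src, cur)); // 5
    }
}
```

Whether the lambda indirection costs anything after JIT compilation is not something this sketch measures. If it matters, the same loop can take the MemorySegment directly as a parameter instead of a ByteSource.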