
How can I use VarHandle to perform a compare-and-swap with memory allocated by SegmentAllocator or sun.misc.Unsafe?


In our codebase we have an array-like data structure class that allocates memory with sun.misc.Unsafe so that it can accept a long for its size and index parameters. The class also uses the sun.misc.Unsafe method compareAndSwapLong to perform compare-and-swap (CAS) operations on elements of the array. It produces compiler warnings that cannot be suppressed, and per JEP 471 the memory-access methods in sun.misc.Unsafe are now deprecated, so we are looking to update it to use Java's newer unsafe APIs.

Java's VarHandle class was introduced in JEP 193/Java 9 and is touted as a safe way to perform low-overhead CAS operations. However, it only appears to operate on existing safe data structures; for example, to CAS an element of an array the docs show this example:

String[] sa = ...
VarHandle avh = MethodHandles.arrayElementVarHandle(String[].class);
boolean r = avh.compareAndSet(sa, 10, "expected", "new");

The docs then specify that the compareAndSet call accepts an int in the index position. Looking through the VarHandle APIs, nothing jumps out as obviously exposing a CAS operation against memory allocated using either the old sun.misc.Unsafe method or the new SegmentAllocator interface introduced in JEP 454/Java 22. Where is this functionality exposed, if anywhere?


Solution

  • Use MemoryLayout::arrayElementVarHandle(PathElement...) to create a VarHandle that can access a segment like an array with long indices.

    Creates a var handle that accesses adjacent elements in a memory segment at offsets selected by the given layout path, where the accessed elements have this layout, and where the initial layout in the path is this layout. The returned var handle has the following characteristics:

    • its type is derived from the carrier of the selected value layout;
    • it has a leading parameter of type MemorySegment representing the accessed segment
    • a following long parameter, corresponding to the base offset, denoted as B;
    • a following long parameter, corresponding to the array index, denoted as I0. The array index is used to scale the accessed offset by this layout size;
    • it has zero or more trailing access coordinates of type long, one for each open path element in the provided layout path, denoted as I1, I2, ... In, respectively. The order of these access coordinates corresponds to the order in which the open path elements occur in the provided layout path.

    [...]
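
    The coordinate list described in the quoted documentation can be inspected at runtime with VarHandle::coordinateTypes. The following is only a minimal sketch, assuming Java 22 or later; the class name CoordinateTypesDemo and its two helper methods are made up for illustration.

```java
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.util.List;

// Hypothetical helper class, not part of any library.
final class CoordinateTypesDemo {

  // Coordinates of the unbound handle: (segment, base offset B, array index I0).
  static List<Class<?>> unbound() {
    return ValueLayout.JAVA_INT.arrayElementVarHandle().coordinateTypes();
  }

  // Coordinates once the base offset (position 1) is fixed to 0: (segment, array index).
  static List<Class<?>> offsetBound() {
    VarHandle handle = ValueLayout.JAVA_INT.arrayElementVarHandle();
    return MethodHandles.insertCoordinates(handle, 1, 0L).coordinateTypes();
  }

  public static void main(String[] args) {
    System.out.println(unbound());
    System.out.println(offsetBound());
  }
}
```

    The first list should contain MemorySegment followed by two long coordinates, and the second should contain MemorySegment followed by a single long, which is the array index.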

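    For example, the following program allocates an int array inside a memory segment and performs two compareAndSet calls on it: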

    import java.lang.foreign.Arena;
    import java.lang.foreign.ValueLayout;
    import java.lang.invoke.MethodHandles;
    
    public class Main {
    
      private static final long LENGTH = 10L;
    
      public static void main(String[] args) {
        try (var arena = Arena.ofConfined()) {
          var segment = arena.allocate(ValueLayout.JAVA_INT, LENGTH);
          var handle = ValueLayout.JAVA_INT.arrayElementVarHandle();
          /*
           * This binds the MemorySegment (first argument) and offset (second argument) to
           * 'segment' and '0L', respectively. This makes it so the MemorySegment and offset
           * don't have to be passed every time a method is invoked on the VarHandle. For
           * instance, this:
           *
           *     handle.set(segment, 0L, 5L, 42)
           *
           * Can now just be:
           *
           *     handle.set(5L, 42)
           *
           * Note if you want to use the VarHandle with multiple MemorySegment arrays, then do
           * not bind the MemorySegment argument. You can still bind the offset argument with:
           *
           *     MethodHandles.insertCoordinates(handle, 1, 0L)
           *
           * The MemorySegment is bound in this example in an attempt to make it more readable.
           */
          handle = MethodHandles.insertCoordinates(handle, 0, segment, 0L);
    
          // populate array so not all elements are 0
          for (long i = 0; i < LENGTH; i++) {
            handle.set(i, (int) i * 2);
          }
    
          // print array
          for (long i = 0; i < LENGTH; i++) {
            int value = (int) handle.get(i);
            System.out.printf("[%d] = %d%n", i, value);
          }
    
          // demonstrate using compareAndSet
          System.out.println();
          System.out.println("cas(3, 9, 117): " + handle.compareAndSet(3L, 9, 117)); // will fail
          System.out.println("cas(5, 10, 42): " + handle.compareAndSet(5L, 10, 42)); // will succeed
          System.out.println();
    
          // print array to show state after compareAndSet
          for (long i = 0; i < LENGTH; i++) {
            int value = (int) handle.get(i);
            System.out.printf("[%d] = %d%n", i, value);
          }
        }
      }
    }
    

    Note: In real code you'll probably want to store the VarHandle in a static final field to improve performance, which means binding a particular memory segment as the first argument is likely not an option.

    Output:

    [0] = 0
    [1] = 2
    [2] = 4
    [3] = 6
    [4] = 8
    [5] = 10
    [6] = 12
    [7] = 14
    [8] = 16
    [9] = 18
    
    cas(3, 9, 117): false
    cas(5, 10, 42): true
    
    [0] = 0
    [1] = 2
    [2] = 4
    [3] = 6
    [4] = 8
    [5] = 42
    [6] = 12
    [7] = 14
    [8] = 16
    [9] = 18
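
    Following the note about static final fields, here is a rough sketch of how a long-indexed array of long elements (closer to the compareAndSwapLong use case in the question) might look when only the base offset is bound and the segment is passed on every call. The class name LongSegmentArray and its methods are hypothetical, and the choice of volatile get/set is just one option, not a recommendation from the documentation.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

// Hypothetical wrapper class, not part of any library.
final class LongSegmentArray {

  // Base offset (coordinate 1) is bound to 0, so the remaining
  // coordinates are (MemorySegment segment, long index).
  private static final VarHandle ELEMENT = MethodHandles.insertCoordinates(
      ValueLayout.JAVA_LONG.arrayElementVarHandle(), 1, 0L);

  private final MemorySegment segment;

  LongSegmentArray(Arena arena, long length) {
    // Allocates 'length' zero-initialized longs, aligned for JAVA_LONG.
    this.segment = arena.allocate(ValueLayout.JAVA_LONG, length);
  }

  long get(long index) {
    return (long) ELEMENT.getVolatile(segment, index);
  }

  void set(long index, long value) {
    ELEMENT.setVolatile(segment, index, value);
  }

  boolean compareAndSet(long index, long expected, long newValue) {
    return ELEMENT.compareAndSet(segment, index, expected, newValue);
  }

  public static void main(String[] args) {
    try (Arena arena = Arena.ofConfined()) {
      LongSegmentArray array = new LongSegmentArray(arena, 4L);
      array.set(2L, 10L);
      System.out.println(array.compareAndSet(2L, 9L, 117L)); // false: element holds 10, not 9
      System.out.println(array.compareAndSet(2L, 10L, 42L)); // true
      System.out.println(array.get(2L));                     // 42
    }
  }
}
```

    If the array is shared between threads, the segment would need to come from an arena that permits that, such as Arena.ofShared(), rather than the confined arena used in main.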
    

    However, note the access mode restrictions:

    Access mode restrictions

    A var handle returned by varHandle(PathElement...) or ValueLayout.varHandle() features certain access characteristics, which are derived from the selected layout L:

    • A carrier type T, derived from L.carrier()
    • An alignment constraint A, derived from L.byteAlignment()
    • An access size S, derived from L.byteSize()

    Depending on the above characteristics, the returned var handle might feature certain access mode restrictions. We say that a var handle is aligned if its alignment constraint A is compatible with the access size S, that is if A >= S. An aligned var handle is guaranteed to support the following access modes:

    • read write access modes for all T. On 32-bit platforms, access modes get and set for long, double and MemorySegment are supported but might lead to word tearing, as described in Section 17.7. of The Java Language Specification.
    • atomic update access modes for int, long, float, double and MemorySegment. (Future major platform releases of the JDK may support additional types for certain currently unsupported access modes.)
    • numeric atomic update access modes for int, long and MemorySegment. (Future major platform releases of the JDK may support additional numeric types for certain currently unsupported access modes.)
    • bitwise atomic update access modes for int, long and MemorySegment. (Future major platform releases of the JDK may support additional numeric types for certain currently unsupported access modes.)

    If T is float, double or MemorySegment then atomic update access modes compare values using their bitwise representation (see Float.floatToRawIntBits(float), Double.doubleToRawLongBits(double) and MemorySegment.address(), respectively).

    Alternatively, a var handle is unaligned if its alignment constraint A is incompatible with the access size S, that is, if A < S. An unaligned var handle only supports the get and set access modes. All other access modes will result in UnsupportedOperationException being thrown. Moreover, while supported, access modes get and set might lead to word tearing.

    The documentation of VarHandle explains which methods belong to each of the five access "groups" (i.e., "read", "write", "atomic update", "numeric atomic update", and "bitwise atomic update").

    This means methods like compareAndSet will throw an UnsupportedOperationException (at least in Java 22) for value layouts such as JAVA_BYTE. As far as I can tell, that means you won't be able to create a direct analog of byte[] / ByteBuffer that uses long indices and also supports atomic updates such as compareAndSet.
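
    The restriction can be observed with a small experiment. This is a sketch only, assuming the Java 22 behavior described in the documentation quoted above; the class name CasSupportDemo and its helper methods are invented for illustration, and later JDK releases may add support for more types.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.VarHandle;

// Hypothetical helper class, not part of any library.
final class CasSupportDemo {

  // int is listed among the types that support atomic update access modes.
  static boolean intCasSucceeds(MemorySegment segment) {
    VarHandle handle = ValueLayout.JAVA_INT.arrayElementVarHandle();
    // Coordinates: segment, base offset 0, index 0; swap 0 -> 1.
    return handle.compareAndSet(segment, 0L, 0L, 0, 1);
  }

  // byte is not listed, so the call is expected to throw instead of swapping.
  static boolean byteCasUnsupported(MemorySegment segment) {
    VarHandle handle = ValueLayout.JAVA_BYTE.arrayElementVarHandle();
    try {
      handle.compareAndSet(segment, 0L, 0L, (byte) 0, (byte) 1);
      return false; // the access mode was supported after all
    } catch (UnsupportedOperationException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    try (Arena arena = Arena.ofConfined()) {
      MemorySegment segment = arena.allocate(ValueLayout.JAVA_INT, 4L);
      System.out.println("int CAS succeeded: " + intCasSucceeds(segment));
      System.out.println("byte CAS unsupported: " + byteCasUnsupported(segment));
    }
  }
}
```

    On Java 22 both lines are expected to report true, the second because the byte var handle rejects compareAndSet before touching memory.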