The performance difference between java.lang.System and Unsafe


System and Unsafe offer some overlapping functionality (for example, System.arraycopy vs. _UNSAFE.copyMemory).

In terms of implementation, it looks like both rely on JNI. Is that correct? (I could find unsafe.cpp, but could not find the corresponding arraycopy implementation in the JVM source code.)

Also, if both rely on JNI, can I assume that their invocation overhead is similar?

I know Unsafe can also manipulate off-heap memory, but let's restrict the comparison to on-heap memory here.

Thanks for the answer.


Solution

  • Both System.arraycopy and Unsafe.copyMemory are HotSpot intrinsics. This means the JVM does not go through the JNI implementation when these methods are called from JIT-compiled code. Instead, it replaces the call with optimized, architecture-specific assembly code.

    You may find the sources in stubGenerator_<arch>.cpp.

    Here is a simple JMH benchmark:

    import org.openjdk.jmh.annotations.Benchmark;
    import org.openjdk.jmh.annotations.Param;
    import org.openjdk.jmh.annotations.Scope;
    import org.openjdk.jmh.annotations.Setup;
    import org.openjdk.jmh.annotations.State;
    
    import java.util.concurrent.ThreadLocalRandom;
    
    import static one.nio.util.JavaInternals.byteArrayOffset;
    import static one.nio.util.JavaInternals.unsafe;
    
    @State(Scope.Benchmark)
    public class CopyMemory {
    
        @Param({"12", "123", "1234", "12345", "123456"})
        int size;
    
        byte[] src;
        byte[] dst;
    
        @Setup
        public void setup() {
            src = new byte[size];
            dst = new byte[size];
            ThreadLocalRandom.current().nextBytes(src);
        }
    
        @Benchmark
        public void systemArrayCopy() {
            System.arraycopy(src, 0, dst, 0, src.length);
        }
    
        @Benchmark
        public void unsafeCopyMemory() {
            unsafe.copyMemory(src, byteArrayOffset, dst, byteArrayOffset, src.length);
        }
    }
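
    The benchmark above takes unsafe and byteArrayOffset from the one.nio library. If that dependency is not available, the same two values can usually be obtained through reflection. The following is only a sketch under that assumption: the class name UnsafeCopySketch and its helper methods are hypothetical, and recent JDK releases deprecate the memory-access methods of sun.misc.Unsafe and may print warnings when they are used.

```java
import java.lang.reflect.Field;
import java.util.Arrays;
import sun.misc.Unsafe;

// Hypothetical stand-in for one.nio's JavaInternals.unsafe / byteArrayOffset.
public class UnsafeCopySketch {

    static final Unsafe UNSAFE = loadUnsafe();

    // Offset of element 0 inside a byte[] object (skips the array header).
    static final long BYTE_ARRAY_OFFSET = UNSAFE.arrayBaseOffset(byte[].class);

    private static Unsafe loadUnsafe() {
        try {
            // sun.misc.Unsafe keeps its singleton in a private static field.
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    // Copies the whole array with Unsafe.copyMemory (no bounds or type checks).
    public static byte[] copyWithUnsafe(byte[] src) {
        byte[] dst = new byte[src.length];
        UNSAFE.copyMemory(src, BYTE_ARRAY_OFFSET, dst, BYTE_ARRAY_OFFSET, src.length);
        return dst;
    }

    // Copies the whole array with System.arraycopy (bounds and type checked).
    public static byte[] copyWithSystem(byte[] src) {
        byte[] dst = new byte[src.length];
        System.arraycopy(src, 0, dst, 0, src.length);
        return dst;
    }

    public static void main(String[] args) {
        byte[] src = {1, 2, 3, 4, 5};
        // Both copies should contain the same bytes as the source.
        System.out.println(Arrays.equals(copyWithUnsafe(src), copyWithSystem(src)));
    }
}
```

    Note that the Unsafe variant performs no bounds checking, so a wrong length or offset can corrupt the heap instead of throwing an exception.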
    

    It confirms that the performance of both methods is similar:

    Benchmark                    (size)  Mode  Cnt     Score    Error  Units
    CopyMemory.systemArrayCopy       12  avgt   16     5.294 ±  0.162  ns/op
    CopyMemory.systemArrayCopy      123  avgt   16     7.057 ±  0.406  ns/op
    CopyMemory.systemArrayCopy     1234  avgt   16    18.761 ±  0.492  ns/op
    CopyMemory.systemArrayCopy    12345  avgt   16   353.386 ±  3.627  ns/op
    CopyMemory.systemArrayCopy   123456  avgt   16  5234.125 ± 57.914  ns/op
    CopyMemory.unsafeCopyMemory      12  avgt   16     5.028 ±  0.120  ns/op
    CopyMemory.unsafeCopyMemory     123  avgt   16     8.055 ±  0.405  ns/op
    CopyMemory.unsafeCopyMemory    1234  avgt   16    19.776 ±  0.523  ns/op
    CopyMemory.unsafeCopyMemory   12345  avgt   16   353.549 ±  5.878  ns/op
    CopyMemory.unsafeCopyMemory  123456  avgt   16  5246.298 ± 65.427  ns/op
    

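    If you run this JMH benchmark with the -prof perfasm profiler, you'll see that both methods boil down to essentially the same AVX copy loop, even though they end up in different stubs (jbyte_disjoint_arraycopy and jlong_disjoint_arraycopy):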

    # systemArrayCopy
    
      0.64%   ↗   0x00007fa95d4336d0:   vmovdqu -0x38(%rdi,%rdx,8),%ymm0
      2.81%   │   0x00007fa95d4336d6:   vmovdqu %ymm0,-0x38(%rsi,%rdx,8)
      5.67%   │   0x00007fa95d4336dc:   vmovdqu -0x18(%rdi,%rdx,8),%ymm1
     69.64%   │   0x00007fa95d4336e2:   vmovdqu %ymm1,-0x18(%rsi,%rdx,8)
     15.28%   │   0x00007fa95d4336e8:   add    $0x8,%rdx
              ╰   0x00007fa95d4336ec:   jle    Stub::jbyte_disjoint_arraycopy+112 0x00007fa95d4336d0
    
    # unsafeCopyMemory
      
      1.08%   ↗   0x00007f2d39833af0:   vmovdqu -0x38(%rdi,%rdx,8),%ymm0
      3.09%   │   0x00007f2d39833af6:   vmovdqu %ymm0,-0x38(%rcx,%rdx,8)
      5.78%   │   0x00007f2d39833afc:   vmovdqu -0x18(%rdi,%rdx,8),%ymm1
     66.44%   │   0x00007f2d39833b02:   vmovdqu %ymm1,-0x18(%rcx,%rdx,8)
     19.00%   │   0x00007f2d39833b08:   add    $0x8,%rdx
              ╰   0x00007f2d39833b0c:   jle    Stub::jlong_disjoint_arraycopy+48 0x00007f2d39833af0
    

    When working with regular arrays in the Java heap, there is absolutely no need to use the Unsafe API. The standard System.arraycopy is very well optimized. The JDK class library itself uses System.arraycopy pretty much everywhere, including StringBuilder, ArrayList, ByteArrayOutputStream, etc.
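
    To illustrate the kind of on-heap work System.arraycopy covers, here is a minimal sketch of two operations a growable buffer typically needs: enlarging the backing array and shifting elements to make room for an insert. The class ArrayCopyUsage and its methods are hypothetical and written for illustration only; they are not the JDK's actual ArrayList or StringBuilder code.

```java
import java.util.Arrays;

// Hypothetical helper class showing typical on-heap uses of System.arraycopy.
public class ArrayCopyUsage {

    // Returns a larger array holding the same leading bytes as 'data'.
    public static byte[] grow(byte[] data, int newCapacity) {
        byte[] bigger = new byte[newCapacity];
        System.arraycopy(data, 0, bigger, 0, data.length);
        return bigger;
    }

    // Inserts 'value' at 'index', shifting the tail one slot to the right.
    // Source and destination overlap here; System.arraycopy is specified
    // to behave as if the data were first copied to a temporary array.
    public static void insertAt(byte[] data, int size, int index, byte value) {
        System.arraycopy(data, index, data, index + 1, size - index);
        data[index] = value;
    }

    public static void main(String[] args) {
        byte[] buf = {1, 2, 4, 0};          // three used slots, one spare
        insertAt(buf, 3, 2, (byte) 3);      // buf becomes [1, 2, 3, 4]
        byte[] bigger = grow(buf, 8);       // same bytes, zero-padded to length 8
        System.out.println(Arrays.toString(buf));
        System.out.println(Arrays.toString(bigger));
    }
}
```

    Unlike Unsafe.copyMemory, these calls are bounds-checked, so an incorrect index results in an IndexOutOfBoundsException rather than silent memory corruption.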