
LinuxPerfAsmProfiler shows the Java code corresponding to an assembly hot spot on Java 8, but not on Java 14


When investigating an issue related to instantiation of Spring's org.springframework.util.ConcurrentReferenceHashMap (as of spring-core-5.1.3.RELEASE), I used the LinuxPerfAsmProfiler shipped with JMH to profile the generated assembly.

I simply ran this benchmark:

@Benchmark
public Object measureInit() {
  return new ConcurrentReferenceHashMap<>();
}

Benchmarking on JDK 8 makes it possible to identify one of the non-obvious hot spots:

  0.61%        0x00007f32d92772ea: lock addl $0x0,(%rsp)     ;*putfield count
                                                             ; - org.springframework.util.ConcurrentReferenceHashMap$Segment::<init>@11 (line 476)
                                                             ; - org.springframework.util.ConcurrentReferenceHashMap::<init>@141 (line 184)
 15.81%        0x00007f32d92772ef: mov    0x60(%r15),%rdx

This corresponds to an unnecessary assignment of the default value to a volatile field:

protected final class Segment extends ReentrantLock {
  private volatile int count = 0;
}

and Segment is in turn instantiated in a loop in the constructor of ConcurrentReferenceHashMap:

public ConcurrentReferenceHashMap(
    int initialCapacity, float loadFactor, int concurrencyLevel, ReferenceType referenceType) {
  this.loadFactor = loadFactor;
  this.shift = calculateShift(concurrencyLevel, MAXIMUM_CONCURRENCY_LEVEL);
  int size = 1 << this.shift;
  this.referenceType = referenceType;
  int roundedUpSegmentCapacity = (int) ((initialCapacity + size - 1L) / size);
  this.segments = (Segment[]) Array.newInstance(Segment.class, size);
  for (int i = 0; i < this.segments.length; i++) {
    this.segments[i] = new Segment(roundedUpSegmentCapacity);
  }
}

So the instruction is likely to be really hot. The full assembly listing can be found in my gist.
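To get a rough idea of how often that volatile store runs per map, here is a minimal sketch of the constructor arithmetic. The body of calculateShift and the three ASSUMED_* constants (16, 16 and 1 << 16) are assumptions for illustration, since neither appears in the listing above; only the rounding expression is copied from the constructor. The two stripped-down Segment classes are likewise hypothetical and only show that dropping the explicit "= 0" does not change the value a reader observes.

```java
import java.util.concurrent.locks.ReentrantLock;

public class SegmentCountSketch {

    // Assumed defaults; these values are NOT shown in the excerpt above.
    static final int ASSUMED_DEFAULT_INITIAL_CAPACITY = 16;
    static final int ASSUMED_DEFAULT_CONCURRENCY_LEVEL = 16;
    static final int ASSUMED_MAXIMUM_CONCURRENCY_LEVEL = 1 << 16;

    // Hypothetical stand-in for calculateShift: doubles a value starting at 1
    // until it reaches minimumValue or maximumValue and counts the doublings.
    static int calculateShift(int minimumValue, int maximumValue) {
        int shift = 0;
        int value = 1;
        while (value < minimumValue && value < maximumValue) {
            value <<= 1;
            shift++;
        }
        return shift;
    }

    // Number of Segment instances the constructor loop would create.
    static int segmentCount(int concurrencyLevel) {
        return 1 << calculateShift(concurrencyLevel, ASSUMED_MAXIMUM_CONCURRENCY_LEVEL);
    }

    // Same rounding expression as in the constructor shown above.
    static int roundedUpSegmentCapacity(int initialCapacity, int size) {
        return (int) ((initialCapacity + size - 1L) / size);
    }

    // Mirrors the field declaration from the question: explicit volatile write in <init>.
    static final class SegmentWithInitializer extends ReentrantLock {
        private volatile int count = 0;
        int count() { return count; }
    }

    // Same field without the initializer: it still reads as 0, the default for int.
    static final class SegmentWithoutInitializer extends ReentrantLock {
        private volatile int count;
        int count() { return count; }
    }

    public static void main(String[] args) {
        int size = segmentCount(ASSUMED_DEFAULT_CONCURRENCY_LEVEL);
        int capacity = roundedUpSegmentCapacity(ASSUMED_DEFAULT_INITIAL_CAPACITY, size);
        System.out.println("segments per map: " + size);
        System.out.println("capacity per segment: " + capacity);
        System.out.println("count with initializer: " + new SegmentWithInitializer().count());
        System.out.println("count without initializer: " + new SegmentWithoutInitializer().count());
    }
}
```

Under these assumed defaults every new map would construct 16 segments, so the volatile store and the barrier that follows it would execute 16 times per benchmark invocation.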

Then I ran the same benchmark on JDK 14, again with LinuxPerfAsmProfiler, but now nothing in the captured assembly points explicitly to volatile int count = 0.

Looking for the lock addl $0x0 instruction (an addition of 0 under a lock prefix, which HotSpot emits on x86 as the memory barrier after a volatile write), I found this:

  0.08%                          │  0x00007f3717d46187:   lock addl $0x0,-0x40(%rsp)
 23.74%                          │  0x00007f3717d4618d:   mov    0x120(%r15),%rbx

which is likely to correspond to volatile int count = 0, because it follows the constructor call of Segment's superclass, ReentrantLock:

  0.77%                          │  0x00007f3717d46140:   movq   $0x0,0x18(%rax)              ;*new {reexecute=0 rethrow=0 return_oop=0}
                                 │                                                            ; - java.util.concurrent.locks.ReentrantLock::<init>@5 (line 294)
                                 │                                                            ; - org.springframework.util.ConcurrentReferenceHashMap$Segment::<init>@6 (line 484)
                                 │                                                            ; - org.springframework.util.ConcurrentReferenceHashMap::<init>@141 (line 184)
  0.06%                          │  0x00007f3717d46148:   mov    %r8,%rcx
  0.05%                          │  0x00007f3717d4614b:   mov    %rax,%rbx
  0.03%                          │  0x00007f3717d4614e:   shr    $0x3,%rbx
  0.74%                          │  0x00007f3717d46152:   mov    %ebx,0xc(%r8)
  0.06%                          │  0x00007f3717d46156:   mov    %rax,%rbx
  0.05%                          │  0x00007f3717d46159:   xor    %rcx,%rbx
  0.02%                          │  0x00007f3717d4615c:   shr    $0x14,%rbx
  0.72%                          │  0x00007f3717d46160:   test   %rbx,%rbx
                             ╭   │  0x00007f3717d46163:   je     0x00007f3717d4617f
                             │   │  0x00007f3717d46165:   shr    $0x9,%rcx
                             │   │  0x00007f3717d46169:   movabs $0x7f370a872000,%rdi
                             │   │  0x00007f3717d46173:   add    %rcx,%rdi
                             │   │  0x00007f3717d46176:   cmpb   $0x8,(%rdi)
  0.00%                      │   │  0x00007f3717d46179:   jne    0x00007f3717d46509
  0.04%                      ↘   │  0x00007f3717d4617f:   movl   $0x0,0x14(%r8)
  0.08%                          │  0x00007f3717d46187:   lock addl $0x0,-0x40(%rsp)
 23.74%                          │  0x00007f3717d4618d:   mov    0x120(%r15),%rbx

The problem is that there is no mention of putfield count anywhere in the generated assembly.

Could anyone explain why I don't see it?


Solution

  • It turns out that an hsdis built for one JDK version (e.g. JDK 8) cannot be used reliably with another (e.g. JDK 11). For a perfect match you need to build hsdis from the JDK sources, then build the JDK itself and run the application on this ad-hoc build.

    This approach worked perfectly for me when I was investigating "Missing bounds checking elimination in String constructor?".