java, garbage-collection, jstack

How to fix excessively high CPU load in a Java application


Today I found that the CPU load on my server was too high. The server runs nothing but a single Java application.

Here are the steps I took.

  1. I used the top command to find the application's PID. The PID is 25713.

  2. I used top -H -p 25713 to find the threads that used the most CPU, for example: 25719 tomcat 20 0 10.6g 1.5g 13m R 97.8 4.7 314:35.22 java

  3. I used jstack -F 25713 to print a thread dump. It contains entries such as "Gang worker#4 (Parallel GC Threads)" os_prio=0 tid=0x00007f5f10021800 nid=0x6477 runnable

  4. I searched the dump for those thread IDs (the nid field is the thread ID in hexadecimal, so 25719 appears as 0x6477). The threads that used the most CPU all look like "Gang worker#4 (Parallel GC Threads)" os_prio=0 tid=0x00007f5f10021800 nid=0x6477 runnable

  5. After I ran the jstack command, the CPU load went back to normal!
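
Matching the thread ID from step 2 against the nid field from step 3 is a decimal-to-hexadecimal conversion. A minimal sketch of that lookup follows; the class and method names are hypothetical and only illustrate the conversion:

```java
public class NidLookup {
    /** Formats a decimal OS thread ID (as shown by top -H) the way jstack prints nid. */
    static String toNid(long osThreadId) {
        return "0x" + Long.toHexString(osThreadId);
    }

    /** Parses a jstack nid value such as "0x6477" back into the decimal thread ID. */
    static long fromNid(String nid) {
        return Long.decode(nid);
    }

    public static void main(String[] args) {
        System.out.println(toNid(25719));      // thread ID from top -H; prints 0x6477
        System.out.println(fromNid("0x6477")); // nid from the jstack dump; prints 25719
    }
}
```

So the busy thread 25719 reported by top is the same thread as the GC worker with nid=0x6477 in the dump.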

Here are my questions:

  1. Why did the GC threads drive the CPU load so high?
  2. Why did the CPU load return to normal after I ran the jstack command?

This was not a one-off; it happens every time.

Here is a normal GC log entry:

2015-10-10T10:17:52.019+0800: 71128.973: [GC (Allocation Failure) 2015-10-10T10:17:52.019+0800: 71128.973: [ParNew: 309991K->206K(348416K), 0.0051145 secs] 616178K->306393K(1009920K), 0.0052406 secs] [Times: user=0.09 sys=0.00, real=0.01 secs]

When the CPU load gets too high, the GC log stops at [GC (Allocation Failure) 2015-10-10T10:18:10.564+0800: 71147.518: [ParNew: and nothing further is logged.

When I run the jstack command, the log prints:

2015-10-10T10:17:50.757+0800: 53501.137: [GC (Allocation Failure) 2015-10-10T10:17:50.757+0800: 53501.137: [ParNew: 210022K->245K(235968K), 369.6907808 secs] 400188K->190410K(1022400K), 369.6909604 secs] [Times: user=3475.15 sys=11.69, real=369.63 secs]
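
One way to read the [Times: ...] block: user plus sys divided by real roughly approximates how many cores the collection kept busy on average. The sketch below extracts that ratio from a log line. The class name, method name and regular expression are assumptions made for illustration and only cover the log format shown above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GcTimes {
    // Matches the trailing "[Times: user=... sys=..., real=... secs]" block.
    private static final Pattern TIMES = Pattern.compile(
        "\\[Times: user=([0-9.]+) sys=([0-9.]+), real=([0-9.]+) secs\\]");

    /** Returns (user + sys) / real, or NaN if the line has no usable Times block. */
    static double busyCores(String gcLogLine) {
        Matcher m = TIMES.matcher(gcLogLine);
        if (!m.find()) {
            return Double.NaN;
        }
        double user = Double.parseDouble(m.group(1));
        double sys = Double.parseDouble(m.group(2));
        double real = Double.parseDouble(m.group(3));
        return real > 0 ? (user + sys) / real : Double.NaN;
    }

    public static void main(String[] args) {
        // Times block copied from the 369-second collection above.
        String stalled = "[Times: user=3475.15 sys=11.69, real=369.63 secs]";
        System.out.println(busyCores(stalled));
    }
}
```

For the long collection this gives a ratio between 9 and 10, which suggests that about nine GC worker threads were burning CPU for the whole six-minute pause. That is consistent with the Parallel GC threads seen at the top of top -H.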

Solution

  • This is only a guess, but you might be affected by the futex_wait bug present in certain kernel versions.

    More generally, jstack -F sends a signal to the process, which will interrupt any thread that may be sleeping. So maybe the GC threads are just spin-waiting for another thread that somehow missed a wakeup. In other words, if the JVM is indeed stuck in a GC and sending a signal fixes the problem, this may point to a locking or memory-ordering bug, if not in the kernel then in the JVM.

    Instead of using jstack -F, you could try sending SIGQUIT to the process (kill -3 25713; SIGBREAK is the Windows counterpart) and see if that has the same effect.
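
    On Linux, the signal that makes a HotSpot JVM print a thread dump is SIGQUIT (signal 3). A minimal sketch of sending it from Java follows, assuming a Unix-like system with a kill executable on the PATH. The class and method names are hypothetical, and the PID is passed as the first argument.

```java
import java.io.IOException;
import java.util.List;

public class SendQuit {
    /** Builds the command line; kept separate so it can be inspected without running it. */
    static List<String> quitCommand(long pid) {
        return List.of("kill", "-3", Long.toString(pid));
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        // Only target a JVM: by default SIGQUIT terminates other kinds of processes.
        long pid = Long.parseLong(args[0]); // e.g. 25713, the PID found with top
        Process p = new ProcessBuilder(quitCommand(pid)).inheritIO().start();
        System.exit(p.waitFor());
    }
}
```

    The JVM writes the resulting thread dump to its own standard output, so look in the application's console log (for Tomcat, typically catalina.out) rather than in the terminal where the signal was sent.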