jenkins

Jenkins build fails on slaves with java.lang.InterruptedException


From time to time, the Jenkins build fails with "Cannot contact XXXXXXXXXXXX: java.lang.InterruptedException".

It doesn't matter whether the slave is a spot instance or an on-demand instance.

Jenkins ver. 2.60.3 
Amazon EC2 plugin 1.36

Solution

  • I monitored the slave stats and found that the slave was under very heavy load, with a load average around 1200. The slaves build a lot of Docker images, and the load was high because the slave was I/O bound: the Docker data directory /var/lib/docker and the workspace were on EBS and EFS mounts respectively. Upgrading the Linux kernel and switching Docker to the overlay2 storage driver solved the issue.
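
To catch this kind of overload before the master loses contact with the slave, the load average can be checked on the slave itself. The following is only a minimal sketch of such a check, not what was actually run here: the helper names parse_loadavg and is_overloaded and the threshold factor of 2.0 are assumptions made for illustration.

```python
import os


def parse_loadavg(text):
    """Parse the contents of /proc/loadavg into (1min, 5min, 15min) floats."""
    fields = text.split()
    return tuple(float(value) for value in fields[:3])


def is_overloaded(load1, cpus, factor=2.0):
    """Return True if the 1-minute load exceeds cpus * factor.

    The factor is an arbitrary illustrative threshold, not a Jenkins setting.
    """
    return load1 > cpus * factor


if __name__ == "__main__":
    # os.getloadavg() returns the 1, 5 and 15 minute load averages on Unix.
    load1, load5, load15 = os.getloadavg()
    cpus = os.cpu_count() or 1
    print(f"load averages: {load1:.2f} {load5:.2f} {load15:.2f} on {cpus} CPUs")
    if is_overloaded(load1, cpus):
        print("slave looks overloaded; check disk I/O and the Docker storage driver")
```

On a Linux slave, a load far above the CPU count combined with low CPU usage usually points to processes waiting on I/O. The storage driver currently in use can be checked with docker info --format '{{.Driver}}'.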