Troubleshooting slow Jenkins


Jenkins is extremely slow when viewing job pages (over 3 minutes, with a cold disk cache). The main page displays fine; the problem is only when viewing pages for individual jobs.

I think the problem started with a recent update of Jenkins and its plugins. How can I go about troubleshooting a problem like this?


Solution

    Reproduce the problem

    First, make sure you can reproduce the problem reliably, so that you can tell whether a given change actually fixed it. If the slowdown only occurs when the cache is cold, clearing the disk cache before each test (instructions for Linux) makes it repeatable.
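On Linux, clearing the disk cache usually comes down to writing to /proc/sys/vm/drop_caches as root. A minimal sketch, assuming a Linux host; DROP_CACHES_FILE is a hypothetical override introduced here only so the function can be tried against a scratch file, and the job name in the usage comment is a placeholder:

```shell
# Sketch only: flush the Linux page cache so the next request hits a cold cache.
# DROP_CACHES_FILE is a hypothetical override (not a kernel or Jenkins setting).
DROP_CACHES_FILE="${DROP_CACHES_FILE:-/proc/sys/vm/drop_caches}"

drop_disk_cache() {
    sync                             # write dirty pages out first so they can be evicted
    echo 3 > "$DROP_CACHES_FILE"     # 3 = free page cache plus dentries and inodes
}

# Typical use, as root; "my-job" is a placeholder job name:
#   drop_disk_cache
#   time curl -s -o /dev/null http://localhost:8080/job/my-job/
```

Timing the same job page right after each flush gives a comparable number to check changes against.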

    Disable or downgrade plugins

    Jenkins' "Manage Plugins" page (under the Manage Jenkins section) lets you disable and downgrade plugins individually. If you suspect a particular plugin, disable or downgrade it and retest to confirm whether it is the cause.
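If the slowdown followed an update, it can help to see which plugin files changed most recently before deciding what to disable. A rough sketch, assuming plugins are stored as *.jpi or *.hpi files in a plugins directory under JENKINS_HOME; the default path below is a guess for a setup where JENKINS_HOME is /home/jenkins and may differ on your system:

```shell
# Sketch only: list plugin archives in a plugins directory, newest first.
# The default directory is an assumption (JENKINS_HOME=/home/jenkins).
recent_plugins() {
    dir="${1:-/home/jenkins/plugins}"
    # ls -t sorts by modification time, newest first; unmatched globs are silenced.
    ls -t "$dir"/*.jpi "$dir"/*.hpi 2>/dev/null | head -n "${2:-10}"
}

# Usage: recent_plugins /home/jenkins/plugins 5
```

The plugins at the top of the list are reasonable first candidates to disable or downgrade.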

    Use strace

    strace shows the system calls that Jenkins is making. First, get the main Jenkins PID:

    root@server:~# ps -ef | grep jenkins
    jenkins    589     1  0 17:03 ?        00:00:00 /usr/bin/daemon --name=jenkins --inherit --env=JENKINS_HOME=/home/jenkins --output=/var/log/jenkins/jenkins.log --pidfile=/var/run/jenkins/jenkins.pid --umask=027 -- /usr/bin/java -Djava.awt.headless=true -jar /usr/share/jenkins/jenkins.war --webroot=/var/cache/jenkins/war --httpPort=8080 --ajp13Port=-1
    jenkins    591   589  7 17:03 ?        00:00:51 /usr/bin/java -Djava.awt.headless=true -jar /usr/share/jenkins/jenkins.war --webroot=/var/cache/jenkins/war --httpPort=8080 --ajp13Port=-1
    

    (The PID is 591 in this case: the java process itself, not the /usr/bin/daemon wrapper at 589.)
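Picking the right line by eye is easy to get wrong, since the daemon wrapper's arguments also mention jenkins.war. A small sketch that filters `ps -ef` output for the JVM line, assuming the standard `ps -ef` column layout and a Jenkins started from jenkins.war as above; the function name is made up for illustration:

```shell
# Sketch only: read `ps -ef` output on stdin and print the PID of the Jenkins JVM.
# Column 8 of `ps -ef` is the executable; the daemon wrapper is skipped because
# its executable is /usr/bin/daemon even though its arguments mention jenkins.war.
jenkins_java_pid() {
    awk '$8 ~ /(^|\/)java$/ && /jenkins\.war/ { print $2 }'
}

# Usage: ps -ef | jenkins_java_pid
```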

    Next, run strace. Because Jenkins is multi-threaded, you'll want to add -f to trace all threads.

    strace -p 591 -f
    

    If you're lucky, you'll find an obvious cause of slowdown. (In my case, one of the threads was repeatedly opening each previous build's build.xml for the particular job I was trying to view.)
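Raw strace output scrolls by quickly, so a pattern such as repeated opens of the same files is easier to spot in a summary. A sketch under the assumption that the trace was written to a log with strace's usual line format; the log path and the function name are placeholders:

```shell
# Sketch only: count how often each path is opened in an strace log read from stdin.
# Assumes lines such as:  591 openat(AT_FDCWD, "/some/path", O_RDONLY) = 5
open_counts() {
    # With '"' as the field separator, $2 is the first quoted argument: the path.
    awk -F'"' '/open(at)?\(/ { n[$2]++ } END { for (p in n) print n[p], p }' |
        sort -rn | head -n "${1:-10}"
}

# Capture first (as root), then summarise; /tmp/jenkins.strace is a placeholder:
#   strace -f -p 591 -e trace=file -o /tmp/jenkins.strace
#   open_counts < /tmp/jenkins.strace
```

Here -e trace=file limits the trace to system calls that take a file name, and -o writes the trace to a file instead of the terminal.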

    Use jstack

    strace monitors system calls and tells you what a process is doing; jstack shows the call stack of every thread in a Java process, which helps tell you why it's doing it (what it's trying to accomplish).

    jstack takes a PID and must run as the same user as the process you're inspecting.

    sudo -u jenkins jstack 591
    

    This displays a lot of information: a stack trace for each of Jenkins' threads, most of them dominated by library and framework frames (request handling, XML parsing, and so on). Somewhere in there, though, you should be able to find the stack trace of the request handler that's running slowly, and some portion of that trace will indicate what it's trying to do.
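To cut the dump down, one option is to keep only the thread blocks that mention something you care about. A minimal sketch, relying on jstack separating threads with blank lines; the function name is made up, and the pattern in the usage comment (Jenkins' hudson.model package) is only an example to adapt:

```shell
# Sketch only: print the thread blocks from jstack output (stdin) that match a pattern.
threads_matching() {
    # RS= puts awk in paragraph mode, so each blank-line-separated block is one record.
    awk -v RS= -v pat="$1" '$0 ~ pat { print; print "" }'
}

# Usage; the pattern is an example, not a required value:
#   sudo -u jenkins jstack 591 | threads_matching 'hudson.model'
```

Taking two or three dumps a few seconds apart and comparing the matching threads can show whether the same handler is stuck in the same place.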