I'm experiencing some odd behavior with a Jenkins build (Jenkins project is a multi-branch pipeline with the Jenkinsfile provided by the source repository). The last step is to deploy the application which involves replacing an artifact on a remote host and then restarting the process that runs it.
Everything works perfectly except for one problem - the service is no longer running after the build completes. I even added some debugging messages after the restart script to prove with the build output that it really was working. But for some reason, after the build exits the service is no longer running. I've done extensive testing to ensure Jenkins connects to the remote host as the correct user, has the right env vars set, etc. Plus, the restart script output is very detailed in the first place - there would be no way to get the successful output if it didn't actually work. So I am assuming the process that runs the deploy steps on the remote host is doing something else after the build completes execution.
Here is where it gets weird: if I run the same exact deploy commands using the Script Console for the same exact remote host, it works. And the service isn't stopped after successfully starting up.
By "same exact" I mean the script is the same, but the DSL is different between the Script Console and the pipeline. For example, in the Script Console, I use
println "deployscript.sh <args>".execute().text
Whereas in the pipeline I use
pipeline {
agent {
node 'mynode'
}
stages {
/*other stages commented out for testing*/
stage('Deploy') {
steps {
script {
sh 'deployscript.sh <args>'
}
}
}
}
}
I also don't have any issues running the commands manually via SSH.
Does anyone know what is going on here? Is there a difference in how the Script Console vs the Build Agent connects to the remote host? Do either of these processes run other commands? I understand that the SSH session is controlled by a Java process, but I don't know much else about the Jenkins implementation.
If anyone is curious about the application itself, it is a Progress Application Server for OpenEdge (PASOE) instance. The deploy process involves un-deploying the old WAR file, deploying the new one, and then stopping/starting the instance.
UPDATE: I added 60-second sleep to the end of the deploy script to give me time to test the service before the Jenkins process ended. This was successful, so I am certain that when the Jenkins build process exits is when it causes the service to go down. I am not sure if this is an issue with Jenkins owning a process, but again the Script Console handles this fine...
Found the issue. It's buried away in some low-level Jenkins documentation, but Jenkins builds have a default behavior of killing any processes spawned by the build. This confirms that Jenkins was the culprit and the build indeed was running correctly. It was just being killed after the build completed.
The fix is to set the value of the BUILD_ID environment variable (JENKINS_NODE_COOKIE for pipeline, like in my situation) to "dontKillMe".
For example:
pipeline {
agent { /*set agent*/ }
environment {
JENKINS_NODE_COOKIE="dontKillMe"
}
stages { /*set build stages*/ }
}
See here for more details: https://wiki.jenkins.io/display/JENKINS/ProcessTreeKiller