My GitLab CI job failed with the message ERROR: Job failed: exit status 1. This message is not informative enough for me to troubleshoot the error.
I am implementing CI/CD for a Node.js Express application. Before I build and deploy the server application, I gracefully shut down the instance of the application that is currently running.
This is what I am trying to do inside the stop-job.
However, when I run the stop job, the GitLab runner fails with the message ERROR: Job failed: exit status 1.
This is my code, inside .gitlab-ci.yml:
stages: # List of stages for jobs, and their order of execution
  - stop

stop-job: # This job runs in the <stop> stage, which runs first.
  stage: stop
  script:
    - echo 'Stopping job ...'
    # Send a kill / shutdown message to a server application listening on port 3000 (and on port 80)
    - echo 'shutdown' | nc localhost 3000 || echo 'No process listening on port 3000'
    - |
      while true; do
        # Count the processes listening on port 80
        process_count=$(lsof -i :80 | grep LISTEN | wc -l)
        # Check whether no process is using port 80
        if [ "$process_count" -eq 0 ]; then
          echo "There is no application or process using port 80"
          break
        fi
        echo "Port 80 is currently in use. Retrying in 5 seconds..."
        # Wait 5 seconds before retrying
        sleep 5
      done
    # We are out of the loop and the current instance using port 80 is closed
    - echo "Stopping job completed!"
  only:
    - main
I believe the error happens around this part of the code:
  while true; do
    # Count the processes listening on port 80
    process_count=$(lsof -i :80 | grep LISTEN | wc -l)
When the command lsof -i :80 returns an exit status of 1, it typically indicates that no process is currently listening on port 80. Although this may seem counterintuitive, an exit status of 1 in this context does not necessarily indicate a command execution failure. Instead, it signifies that lsof did not find any open files or connections associated with port 80.
The cause was that set -o pipefail was set in the shell. lsof -i :80 returns 1 to indicate that no process is using port 80, and pipefail tells the shell to treat a failure at any stage of a pipeline as a failure of the whole pipeline (rather than using only the last command's exit status). The pipeline that computes process_count was therefore reported as failed, and the job stopped with exit status 1.
When running with set -o pipefail, a failure at any stage in a shell pipeline causes the entire pipeline to be considered failed.
To turn this off for the remainder of the current script, use set +o pipefail.
To turn it back on for the remainder of the current script, use set -o pipefail.
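To see the effect of pipefail in isolation, here is a minimal sketch. It uses false as a stand-in for a first pipeline stage that exits with status 1 (the way lsof does when it finds nothing); the variable names status_without and status_with are only for illustration.

```shell
#!/usr/bin/env bash
# `false` stands in for a first stage that exits 1, e.g. lsof finding nothing.

# Without pipefail, the pipeline's status is that of the last command (wc).
set +o pipefail
status_without=0
false | wc -l > /dev/null || status_without=$?

# With pipefail, a non-zero status from any stage fails the whole pipeline.
set -o pipefail
status_with=0
false | wc -l > /dev/null || status_with=$?

# Restore the default so later commands are unaffected.
set +o pipefail

echo "without pipefail: ${status_without}"
echo "with pipefail:    ${status_with}"
```

The `|| status=$?` form is used so the sketch also behaves the same way if the surrounding shell happens to run with set -e.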
The following is the part of the script that needed the correction:
...
# Count the process using port 80
set +o pipefail
process_count=$(lsof -i :80 | grep LISTEN | wc -l)
set -o pipefail
...
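As an alternative to toggling pipefail around the pipeline, the expected non-zero exit statuses can be absorbed with `|| true`. This is only a sketch under assumptions: it assumes the job shell runs with errexit and pipefail enabled (as the failure described here suggests), and count_listeners is a hypothetical helper name that is not part of the original script. Note that grep -c also exits with 1 when it counts zero matching lines, so it is guarded as well.

```shell
#!/usr/bin/env bash
# Assumed to mirror the strict mode of the CI job shell.
set -eo pipefail

# Hypothetical helper: prints the number of LISTEN entries lsof reports for a port.
count_listeners() {
  local port="$1"
  # The inner `|| true` absorbs lsof's exit status 1 when nothing is found.
  # The outer `|| true` absorbs grep -c's exit status 1 when the count is 0
  # (grep -c still prints "0" in that case).
  { lsof -i ":${port}" 2>/dev/null || true; } | grep -c LISTEN || true
}

process_count=$(count_listeners 80)
echo "Processes listening on port 80: ${process_count}"
```

The trade-off is that genuine lsof errors are also swallowed, so this suits a polling loop where "nothing found" and "could not check" can be treated alike.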