ssh gitlab gitlab-ci gitlab-ci-runner openssh

GitLab CI job hangs when an ssh command is run


The job runs a script on another server through SSH (OpenSSH). The script executes successfully, so the connection itself works. The problem is that the session never disconnects: the job stays in the running state and is eventually terminated by the timeout (unless it is stopped manually first).

The command that fails is:

- ssh -o StrictHostKeyChecking=no root@server -p 22 '/home/script.sh'

I can see on the server that the script runs correctly. It seems that the ssh connection never closes. After that command the job does not execute anything else and keeps loading indefinitely.

When the script is executed from the server itself it also works correctly.


Things that I have already tried

I have tried adding exit command in different ways

- ssh -o StrictHostKeyChecking=no root@server -p 22 '/home/script.sh && exit'

- ssh -o StrictHostKeyChecking=no root@server -p 22 '/home/script.sh && exit 0'

- ssh -o StrictHostKeyChecking=no root@server -p 22 '/home/script.sh ; exit'

- ssh -o StrictHostKeyChecking=no root@server -p 22 '/home/script.sh ; exit 0'

I have also tried adding it on the next line

- ssh -o StrictHostKeyChecking=no root@server -p 22 '/home/script.sh'
- exit

I have also tried running the script in the background with &

- ssh -o StrictHostKeyChecking=no root@server -p 22 '/home/script.sh &'
- exit

- ssh -o StrictHostKeyChecking=no root@server -p 22 '/home/script.sh &'

- ssh -o StrictHostKeyChecking=no root@server -p 22 '/home/script.sh & ; exit'

- ssh -o StrictHostKeyChecking=no root@server -p 22 '/home/script.sh & ; exit 0'

I have also tried to kill the ssh process

# add this line after the problematic command (in each of its variants)
 - eval $(ssh-agent -k)

The complete .gitlab-ci.yml:

# This file is a template, and might need editing before it works on your project.
# Build JAVA applications using Apache Maven (http://maven.apache.org)
# For docker image tags see https://hub.docker.com/_/maven/

# This template uses jdk8 for verifying and deploying images
image: maven:3.6.0-jdk-8

stages:
  - build
  - deploy
  - notify
  
build:
  stage: build
  only:
    - dev
  script: "mvn clean install -Dactive.profile=dev -DskipTests -B"
  artifacts:
    paths:
      - target/*.jar
      - notifydeploy.sh
      - $DEV_SSH

deploy:
  stage: deploy
  only:
    - dev
  before_script:
    - 'which ssh-agent || ( apt-get update -y && apt-get install openssh-client -y )'
    # Run ssh-agent (inside the build environment)
    - eval $(ssh-agent -s)
    # Add the SSH key stored in the DEV_SSH variable to the agent store
    - ssh-add <(echo "$DEV_SSH")
    - mkdir -p ~/.ssh
    - chmod 700 ~/.ssh
    - '[[ -f /.dockerenv ]] && echo -e "Host *\n\tStrictHostKeyChecking no\n\n" > ~/.ssh/config'
  script:
    - scp -P 22 ./target/*.jar root@server:/home
    - ssh -o StrictHostKeyChecking=no root@server -p 22 '/home/script.sh'
    - eval $(ssh-agent -k)

notify_fail:
  stage: notify
  allow_failure: true
  only:
    - dev
  when: on_failure
  script:
    - echo "FAIL"
    
notify_success:
  stage: notify
  allow_failure: true
  only:
    - Deploy_to_dev03
  script:
    - chmod +x ./notifydeploy.sh
    - ./notifydeploy.sh

While the GitLab job is stuck waiting on the hanging command, running the same script directly on the server unblocks the job, which then finishes correctly.

Looking for the process on the server with ps aux | grep script.sh shows it while the job is executing the command, but it then disappears, so the script itself does not hang on the server.

Is there any solution for this? I can't think of what else to try.

script.sh looks like this:

#!/bin/bash

status_code=$(curl --write-out %{http_code} --silent --output /dev/null http://server/url/)
status_code_n=$(curl --write-out %{http_code} --silent --output /dev/null http://localhost:8761)
# If it is not equal to 404, the service is running
if [[ "$status_code" == 200  &&  "$status_code_n" == 200 ]] ; then
    echo "Status c $status_code"
    echo "Status n $status_code_n"
    pkill -f jar-process
    sleep 1 
    /usr/bin/java -jar -Dspring.profiles.active=dev /home/jar-process*.jar &
    sleep 1
    status_code_t=$(curl --write-out %{http_code} --silent --output /dev/null http://localhost:8090/api/)
    if [[ "$status_code_t" == 401 ]] ; then  
        echo "Status $status_code_t (401 is OK)"
        echo "The API was deployed successfully"
        exit 0
    else
        echo "Status $status_code_t"
        echo "An error occurred during deployment"
        exit 1
    fi
else
    echo "Status c $status_code"
    echo "Status n $status_code_n"
    exit 1
fi

Solution

  • The short answer is to redirect the standard file descriptors (standard input, output, and error) for the script's java command like this:

    /usr/bin/java ... /home/jar-process*.jar > /dev/null 2>&1 < /dev/null &
    

    This prevents the java process from inheriting the script's standard input, output, and error. Those inherited descriptors are what keep ssh from closing the connection.

    Longer answer: When you run an ssh command such as:

    ssh user@remote '/home/script.sh'
    

    The remote ssh server will create a set of pipes to act as the standard input, output, and error of the remote command. After launching the command, the ssh server will keep the channel open until it sees an end-of-file condition on the pipe associated with the remote command's standard output.

    Your script is launching a process which is supposed to keep running after the script exits. The process is inheriting the pipes created by ssh as its standard descriptors. The process could theoretically write to its standard output, so the ssh server won't see an end-of-file condition on the standard output pipe until the process exits or closes its standard output.
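    The behaviour described above can be reproduced locally without ssh. The following is a minimal sketch under stated assumptions: `cat` stands in for the ssh server reading the command's standard output pipe, and `sleep 2` stands in for the long-running java process. Neither appears in the original setup; they are only placeholders for illustration.

```shell
#!/bin/sh
# Local stand-in for the ssh case: "cat" plays the role of sshd reading the
# command's stdout pipe, and "sleep 2" plays the role of the java process.

start=$(date +%s)
# The inner shell exits at once, but the background sleep inherits the pipe,
# so cat does not see end-of-file until sleep finishes.
sh -c 'sleep 2 & echo "script done"' | cat
inherited=$(( $(date +%s) - start ))

start=$(date +%s)
# Same command with the descriptors redirected: cat sees end-of-file as soon
# as the inner shell exits.
sh -c 'sleep 2 >/dev/null 2>&1 </dev/null & echo "script done"' | cat
redirected=$(( $(date +%s) - start ))

echo "inherited pipe: ${inherited}s, redirected: ${redirected}s"
```

    The first pipeline should take roughly as long as the background process lives, while the second should return almost immediately, which mirrors the difference between the hanging job and the fixed one.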

    You could redirect output for the entire script like this:

    ssh user@remote '/home/script.sh >/dev/null 2>&1 < /dev/null'
    

    However, in your case, the script appears to produce an error message when it fails. Redirecting output for the entire script would prevent you from seeing the error message.
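One way to keep the script's own messages visible while still detaching the background process is to redirect only that process, for example to a log file instead of /dev/null. The sketch below assumes a hypothetical helper named `start_detached` and a hypothetical log path; neither comes from the original script.

```shell
#!/bin/sh
# start_detached LOGFILE CMD [ARGS...]
# Hypothetical helper (not part of the original script): start CMD in the
# background with stdout/stderr appended to LOGFILE and stdin from /dev/null,
# so it holds none of the descriptors that ssh is waiting on.
start_detached() {
    log_file=$1
    shift
    "$@" >>"$log_file" 2>&1 </dev/null &
}

# Possible use inside script.sh (the log path is an assumption):
# start_detached /home/jar-process.log \
#     /usr/bin/java -jar -Dspring.profiles.active=dev /home/jar-process*.jar
```

With this arrangement the echo lines in script.sh still travel back over the ssh connection, and whatever the java process prints ends up in the log file rather than being discarded.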