Tags: kubernetes, jenkins, ssh-key, agent

Jenkins SSH Agent in Kubernetes cannot SSH to Kubernetes master node - Host key verification failed - Using SSH Agent Plugin


Background

I have managed to run Jenkins inside a Kubernetes cluster. It is also connected to the cluster to create deployments.

I am trying to deploy something using the SSH Agent Plugin. My understanding is that I need the pipeline to SSH into the actual machine running the cluster's master node, where I can then execute the deployment with the command:

kubectl create -f deployment.yaml

Progress so far

I have installed the SSH Agent plugin and stored the SSH Private Key in Jenkins.

I've also added the corresponding public key to the authorized_keys file in /home/pi/.ssh on the cluster's master node.

I am able to SSH to it successfully from another machine.
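
For reference, the key was installed along these lines (the key filename and the <master-node> address are placeholders, not values from this question):

# copy the public key into /home/pi/.ssh/authorized_keys on the master node
ssh-copy-id -i ~/.ssh/id_rsa.pub pi@<master-node>

# or append it manually and fix permissions in one shot
cat ~/.ssh/id_rsa.pub | ssh pi@<master-node> 'mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys && chmod 600 ~/.ssh/authorized_keys'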

Problem

When the Pipeline is executed, the log shows the SSH key being added to the slave SSH Agent pod:

[ssh-agent] Using credentials pi (SSH credentials for the master node.)
[ssh-agent] Looking for ssh-agent implementation...    
[ssh-agent] Exec ssh-agent (binary ssh-agent on a remote machine)    
...    
Running ssh-add (command line suppressed)
Identity added: /home/jenkins/agent/workspace/Deployment@tmp/private_key_123123123123132.key (pi@pi1)
[ssh-agent] Started.

But when I try to SSH from the Jenkins slave (the SSH Agent pod), host key verification fails:

+ ssh [email protected] id
Host key verification failed.

Request

Could anybody tell me how to fix this issue? What am I doing wrong?

Additional Details

I am testing with a slimmed-down pipeline like this:

// Start the Pipeline
pipeline {
  // Define the agent where it will run
  agent {
    // kubernetes = kubernetes cloud in Jenkins
    kubernetes {
    }
  }
  // Start declaring the stages of the pipeline
  stages {
    // Stage #3 - Deploy the image to the production kubernetes cluster using an SSH agent
    stage('Deploy to Kubernetes Cluster') {
      steps {
        sshagent(['RPi-SSH']) {
          script {
            sh 'id'
            sh 'ssh [email protected] id'
            sh 'ssh [email protected] ls'
          }
        }
      }
    }
  }
}

With this pipeline, I can see that the first id command returns the id of the 'jenkins' user in the SSH Agent pod. When it tries to SSH to the master node, it just fails.


Solution

  • The hosts you are trying to connect to are probably not in your known_hosts file. Ideally they would be, but in practice nobody bothers with that; instead, accept them the first time you connect by adding this switch to your ssh command:

    ssh -oStrictHostKeyChecking=accept-new [email protected] id
    

    You will find recommendations to set StrictHostKeyChecking to no. It probably doesn't matter in this context, since we are dealing with transient containers whose known_hosts files disappear once the pipeline is done; but once you use it, other developers will copy-paste it into other contexts where it does matter, so... there you go.
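
    Applied to the slimmed-down pipeline above, the deploy steps would look something like this (pi@<master-node> is a placeholder for the real address):

      sshagent(['RPi-SSH']) {
        script {
          // accept-new trusts the host key on first contact, but still
          // rejects the connection if a previously known key changes
          sh 'ssh -oStrictHostKeyChecking=accept-new pi@<master-node> id'
          sh 'ssh -oStrictHostKeyChecking=accept-new pi@<master-node> kubectl create -f deployment.yaml'
        }
      }

    Note that accept-new needs OpenSSH 7.6 or newer; older clients only understand yes/no/ask. If you would rather keep strict checking enabled, another option is to pre-populate the container's known_hosts with ssh-keyscan before the first ssh call:

      // fetch the master node's host keys and trust them for this build
      sh 'mkdir -p ~/.ssh && ssh-keyscan <master-node> >> ~/.ssh/known_hosts'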