With GitLab, how can I dynamically create SSH key pairs in the CI/CD pipeline and save them as CI/CD project variables?

Per the GitLab docs, I've manually created an SSH key pair and added it to the project's CI/CD variables to help provision and configure infrastructure. This has worked well enough for a few assets, but as the asset list grows, manually creating and adding key pairs, reusing the same key pair for multiple unrelated assets, or later rotating keys for security reasons is not sustainable. Instead, when a new asset is being created, I want to dynamically create a new SSH key pair, add it to the project's CI/CD variables, and then continue with my pipeline jobs using those keys.

For context, here is my repo's directory structure:

iac
|__ server1
    |__main.tf
|__ server2
...
|__ server10

In a job's before_script, it's easy enough to create the SSH key pair, but what happens after that is where I'm struggling.

job1:
  before_script:
  - ssh-keygen -t rsa -b 4096 -N '' -q -f ~/.ssh/id_rsa <<< y
  - *<what command goes here?>*

In one approach I interact with the project via curl and the API, but getting the key values into the request string exposes the key contents in the logs. (I'm sure there is a way around this, but I haven't worked it out yet.)
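For reference, the curl variant can be sketched roughly as follows. One likely source of the leak is that the API's response body echoes the stored value back, so the sketch discards the response and lets curl read the key from disk. It assumes a hypothetical masked CI/CD variable GITLAB_TOKEN holding a token with the api scope; CI_API_V4_URL and CI_PROJECT_ID are GitLab's predefined variables, and the CURL override exists only so the function can be dry-run.

```shell
# upload_key_variable NAME KEYFILE
# Sketch: create a file-type, protected CI/CD variable from a key file.
# GITLAB_TOKEN is an assumed (hypothetical) masked variable with "api" scope.
upload_key_variable() {
  # "value=<FILE" makes curl read the field content from disk, so the key
  # never appears on the command line; --output /dev/null drops the
  # response, which would otherwise print the value into the job log.
  "${CURL:-curl}" --silent --show-error --fail --request POST \
    --header "PRIVATE-TOKEN: ${GITLAB_TOKEN}" \
    --form "key=$1" \
    --form "variable_type=file" \
    --form "protected=true" \
    --form "value=<$2" \
    --output /dev/null \
    "${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/variables"
}
```

In a job this would be called as `upload_key_variable server1_ssh_private_key ~/.ssh/id_rsa`.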

In another approach, I dynamically add a gitlab_ssh_keys.tf file to the server<number> directory. Note that this isn't necessarily a Terraform question, as I'll use whatever is best to get the job done; but GitLab has a Terraform provider, and the infrastructure is managed with Terraform. This feels elegant in that, should the infrastructure be destroyed, the variables will be cleaned up as well (otherwise, with Bash, I'll need to make sure the script handles that).

# (rough pipeline pseudo code)
job:
  before_script:
  - ssh-keygen -t rsa -b 4096 -N '' -q -f ~/.ssh/id_rsa <<< y
  - |
    # gitlab_token is pre-configured in the project's CI/CD variables.
    # CI_JOB_TOKEN is available, but it doesn't work here.
    cat <<EOF > ${serverName}/gitlab_ssh_keys.tf
    provider "gitlab" {
      token = var.gitlab_token
    }

    resource "gitlab_project_variable" "${serverName}_ssh_private_key" {
      project       = "${CI_PROJECT_ID}"
      key           = "${serverName}_private_ssh_key"
      value         = chomp(file("~/.ssh/id_rsa"))
      protected     = true
      variable_type = "file"
    }

    resource "gitlab_project_variable" "${serverName}_ssh_public_key" {
    ... ... etc
    }
    EOF
  - <then add the file to the branch/repo dynamically, but how?>

Across both approaches, I'm struggling to find the right token and/or SSH keys to let the pipeline interact with itself. GitLab has a ridiculous number of token types, key types, and use cases for each. Finding the right one has been a challenge. I'm working through this example, but the author is unclear on which one to choose and what permissions to give it.

So my first question is: so that the project's pipeline can dynamically interact with itself, and given the principle of least privilege, which tokens and/or combinations of SSH keys and permissions should I use? This will be a "bot" type identity that's only ever used in the pipelines.

My second question is: if I use the Terraform approach, what is an efficient workflow, and how do I avoid creating a loop when the pipeline updates the branch? Note that the example mentions using [skip ci]. That may still be valid, but I haven't made it that far yet. That post is also more than 2 years old, and a lot has changed since then. There may be a better way.

Thoughts? Thanks!

Solution

I think I would try to expand upon your Terraform solution for this.

Instead of defining a before_script step, I would create a separate job that what is currently your first job depends on.

In your repository, you would add an ssh_keygen/ directory for your Terraform code.

You could make use of the tls provider to generate your keys.

STEP 1: Adding tf code

First you'll need to define the list of machines you want to generate keypairs for:

variables.tf

variable "project_name" {
  type        = string # a single project ID or path, not a list
  description = "The project identifier"
}

variable "ssh_hosts" {
  type        = list(string)
  description = "List of hosts to generate a keypair for"
}

main.tf (this example omits the provider configuration, which you would need to add here)

resource "tls_private_key" "ssh_key" {
  for_each  = toset(var.ssh_hosts)
  algorithm = "ED25519"
}

resource "gitlab_project_variable" "host_private_key" {
  for_each  = toset(var.ssh_hosts)
  project   = var.project_name
  key       = "${each.value}_ssh_private_key"
  value     = tls_private_key.ssh_key[each.value].private_key_openssh
  protected = true
}

resource "gitlab_project_variable" "host_public_key" {
  for_each  = toset(var.ssh_hosts)
  project   = var.project_name
  key       = "${each.value}_ssh_public_key"
  value     = tls_private_key.ssh_key[each.value].public_key_openssh
  protected = true
}

This way, as long as you pass the list of hosts as input for the ssh_hosts variable, an SSH key pair is generated for each of them and added to your project variables in GitLab.

Note: I'm almost certain that GitLab does not reload a running job's environment when variables are added mid-run, which is why this stage needs to be completely separate from the rest of the pipeline.

STEP 2: .gitlab-ci.yml

You would need to add the following job, in its own stage, to your gitlab-ci pipeline so that it runs before your main jobs.


ssh_keypair_generation:
  stage: ssh_keygen # must be listed under "stages:" ahead of the stages that use the keys
  variables:
    TF_VAR_project_name: ${CI_PROJECT_ID}
  before_script:
    # Build a JSON list of the host directories under iac/ (or use any other
    # logic that produces a JSON list of your hosts).
    - export TF_VAR_ssh_hosts="$(find iac -mindepth 1 -maxdepth 1 -type d -printf '"%f"\n' | jq -sc .)"
  script:
    - terraform -chdir=./ssh_keygen init
    - terraform -chdir=./ssh_keygen plan -out=path/to/plan.tfplan
    - terraform -chdir=./ssh_keygen apply -auto-approve path/to/plan.tfplan
  rules: [] # your rules

This should allow you to dynamically create ssh keys for each host in your iac/ directory, and have them dynamically added and removed when you add/remove hosts.
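If the job image lacks jq or GNU find (the -printf flag is a GNU extension), the host list can also be built in plain shell. A minimal sketch, where list_hosts_json is a hypothetical helper name and the directory names are assumed to contain no quotes or backslashes:

```shell
# list_hosts_json DIR
# Print the immediate subdirectories of DIR as a compact JSON list.
# The body runs in a subshell, so its variables do not leak into the job.
list_hosts_json() (
  json=""
  for path in "$1"/*/; do
    [ -d "$path" ] || continue          # skips the unexpanded glob when DIR has no subdirectories
    name="$(basename "$path")"
    json="${json:+$json,}\"$name\""     # comma-separate every entry after the first
  done
  printf '[%s]\n' "$json"
)
```

The before_script line would then become `export TF_VAR_ssh_hosts="$(list_hosts_json iac)"`.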

In your subsequent stages, you would simply use the needs keyword to make sure the keys are generated before anything else runs, and to ensure the environment variables are reloaded (in a new job).

job1_job:
  needs: ["ssh_keypair_generation"]
...
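
Because the Terraform above creates ordinary (env_var) variables, a downstream job receives the key as a string and typically has to write it to a file before ssh will use it. A minimal sketch, assuming a host directory named server1 (so the variable is server1_ssh_private_key) and a hypothetical helper called write_key_file:

```shell
# write_key_file CONTENT DEST
# Store a private key at DEST with owner-only permissions.
# The body runs in a subshell, so the umask change does not affect the rest of the job.
write_key_file() (
  umask 077                      # files and directories created below get no group/other access
  mkdir -p "$(dirname "$2")"
  printf '%s\n' "$1" > "$2"      # always terminate the file with a newline
)
```

A job that needs ssh_keypair_generation could call `write_key_file "$server1_ssh_private_key" ~/.ssh/id_ed25519` in its before_script.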

Finally, if your pipeline updates its own branch, you can avoid triggering a new pipeline by including [skip ci] in the commit message, as you mentioned.
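
A rough sketch of such a commit-and-push step, assuming a hypothetical masked variable GITLAB_TOKEN that holds a project access token with the write_repository scope and a branch that token is allowed to push to. CI_SERVER_HOST, CI_PROJECT_PATH and CI_COMMIT_REF_NAME are GitLab's predefined variables; the helper names and the bot identity are made up for illustration.

```shell
# push_url: HTTPS remote that authenticates with the (assumed) project access token.
push_url() {
  printf 'https://oauth2:%s@%s/%s.git\n' "$GITLAB_TOKEN" "$CI_SERVER_HOST" "$CI_PROJECT_PATH"
}

# commit_and_push MESSAGE FILE...
# Commit the given files and push them back to the current branch.
commit_and_push() {
  message="$1"
  shift
  git add -- "$@"
  # "[skip ci]" in the commit message tells GitLab not to start a pipeline for this push.
  git -c user.name="pipeline-bot" -c user.email="pipeline-bot@example.com" \
    commit --quiet -m "$message [skip ci]"
  git push --quiet "$(push_url)" "HEAD:${CI_COMMIT_REF_NAME}"
}
```

If you would rather leave the commit message alone, GitLab also honours the ci.skip push option (git push -o ci.skip).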

Please let me know if you need any more details on this.