I have the usual stages in a Terraform CI pipeline, i.e. `init` > `validate` > `plan`, etc. The first stage, `init`, always works fine. But when the pipeline reaches the next stage, e.g. `validate`, I get the following error:
```
$ terraform validate
╷
│ Error: Missing required provider
│
│ This configuration requires provider registry.terraform.io/datadog/datadog,
│ but that provider isn't available. You may be able to install it
│ automatically by running:
│   terraform init
╵
```
Now if I run `init` in the same stage as `validate`, it works fine. So basically, a workaround is to either have all commands in one stage or run `init` in every stage, neither of which is ideal, of course.
If I log in to the runner server and manually browse the `.terraform` directory, the provider executable is there. But if I run `terraform validate` from the shell, it again fails with the same error; however, if I run `init` and then `validate`, it works. There are no changes in the `.terraform` directory and its contents before and after `init`: same files, just updated creation datetimes.
If I go back to GitLab and re-run the `validate` stage, it fails; if I then go back to the server shell and run `terraform validate` again, it fails again, with no obvious changes in directory contents or permissions. Run `init` again and it starts working again.
As far as I understand, the only difference between these stages is the cache zip/unzip, since the `.terraform` folder is passed along as a cache.
In the job console I can see the following message:

```
Checking cache for terraform...
Runtime platform arch=amd64 os=linux pid=3798191 revision=90daeee0 version=14.7.0
No URL provided, cache will not be downloaded from shared cache server. Instead a local version of cache will be extracted.
Successfully extracted cache
```
Another thing to note: although downloaded modules are also present in `.terraform`, it never throws an error about modules, only about providers. I guess it has something to do with the executable files?
`config.toml`:

```toml
[[runners]]
  name = "cicd_terraform"
  url = "***"
  token = "****"
  executor = "shell"
  [runners.custom_build_dir]
```

Earlier there was an empty `runners.cache` section, but the situation was the same, so I removed it. I want it to use a local directory as the cache.
`.gitlab-ci.yml`:

```yaml
cache:
  key: terraform
  paths:
    - .terraform

before_script:
  - echo -e "credentials \"$CI_SERVER_HOST\" {\n token = \"$CI_JOB_TOKEN\"\n}" > $TF_CLI_CONFIG_FILE
  - cd ${TF_ROOT}
  - export TF_LOG_CORE=TRACE
  - export TF_LOG_PATH=${TF_ROOT}/terraform_logs.txt
  - ls -al
  - ls -al ${TF_ROOT}
  - echo "$TF_ROOT"

stages:
  - initialize
  - validate

init:
  stage: initialize
  script:
    - terraform -v
    - terraform init -backend-config="*****" -backend-config="*****.tfstate" -backend-config="*****-1" -backend-config="access_key=${AWS_ACCESS_KEY_ID}" -backend-config="secret_key=${AWS_SECRET_ACCESS_KEY}" -input=false -no-color

validate:
  stage: validate
  script:
    - terraform validate
```
```
$ ls -al ${TF_ROOT}/.terraform/providers/registry.terraform.io/datadog/datadog/2.24.0/linux_amd64
total 29256
drwxr-xr-x 2 gitlab-runner gitlab-runner     4096 Feb 19 01:36 .
drwxr-xr-x 3 gitlab-runner gitlab-runner     4096 Feb 19 01:36 ..
-rw-r--r-- 1 gitlab-runner gitlab-runner    48216 Feb 19 01:36 CHANGELOG.md
-rw-r--r-- 1 gitlab-runner gitlab-runner    16725 Feb 19 01:36 LICENSE
-rw-r--r-- 1 gitlab-runner gitlab-runner    12450 Feb 19 01:36 LICENSE-3rdparty.csv
-rw-r--r-- 1 gitlab-runner gitlab-runner     1524 Feb 19 01:36 README.md
-rwxr-xr-x 1 gitlab-runner gitlab-runner 29859840 Feb 19 01:36 terraform-provider-datadog_v2.24.0
```
Any idea what I am doing wrong?
Two things must be true in order for Terraform to be able to find a particular provider:

1. The `.terraform.lock.hcl` file must specify a selected version for that provider, and the allowed plugin checksums for that version.
2. There must be a plugin package for that provider in `.terraform/providers` -- the local plugin cache directory -- which matches one of those checksums.

From what you shared, it seems like the second of these is being handled by you passing the cache between steps using features of your CI system.
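For context, a lock file entry records the selected version and checksums for each provider. A minimal sketch of what the Datadog entry might look like, assuming the 2.24.0 version from your cache (the hash values are placeholders; real ones are generated by `terraform init`):

```hcl
# .terraform.lock.hcl -- generated by "terraform init"; commit it to version control.
provider "registry.terraform.io/datadog/datadog" {
  version     = "2.24.0"
  constraints = "~> 2.24"  # whatever constraint your configuration declares
  hashes = [
    # The real file lists the "h1:" and "zh:" checksums recorded by terraform init.
    "h1:...",
    "zh:...",
  ]
}
```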
In order for the first to be true, though, you'll need to run `terraform init` on your development machine to generate the `.terraform.lock.hcl` file, and then check that file into version control as part of your configuration. Your CI system should then place it in the right location as a normal part of checking out the source code.
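Concretely, that one-time setup might look like this (a sketch; the commit message is just illustrative):

```sh
# On your development machine, in the root of the Terraform configuration:
terraform init                      # generates/updates .terraform.lock.hcl
git add .terraform.lock.hcl
git commit -m "Add provider dependency lock file"
git push
```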
When running `terraform init` in a non-interactive environment like this, I would suggest adding the `-lockfile=readonly` option, which will cause Terraform to fail with an error if the lock file has become inconsistent with the rest of the configuration. That will then allow your CI system to catch this problem early in the first step and return an explicit error about it, whereas in your current workflow `terraform init` can update the lock file itself, but that update doesn't carry forward to the other steps, causing strange downstream errors.
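Applied to your `init` stage, that would look something like the following sketch (backend arguments elided here just as in your job definition):

```sh
# -lockfile=readonly makes init fail loudly if .terraform.lock.hcl would
# need to change, instead of silently rewriting it in this one stage.
terraform init -lockfile=readonly \
  -backend-config="*****" \
  -input=false -no-color
```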