As part of my Terraform setup, I create users based on a JSON object, and I want to keep the users in sync with it.
The JSON object is the response from a custom Python script that fetches data from an external API. Based on this JSON I'm creating users, and I also need to remove them when they disappear from the data.
Python script (api.py):
import json

data = {
    "test_team": {
        "members": [
            {"email": "abc@gmail.com", "name": "abc"},
            {"email": "abcdef@gmail.com", "name": "abcdef"}
        ]
    }
}

output = json.dumps(data)
print(output)
Terraform:
data "external" "python_output" {
  program = ["python", "${path.module}/api.py"]
}

locals {
  json_data = jsondecode(data.external.python_output.result)
  unique_mails = distinct(flatten([
    for team_key, team_data in local.json_data : [
      for member in team_data["members"] : member["email"]
    ]
  ]))
}

resource "user" "user" {
  for_each = { for email in local.unique_mails : email => email }
  name     = each.key
  role     = "user"
}
If a user is not in the list (the key is missing from the JSON), Terraform should synchronize the change and remove the user (destroy the user resource for any user missing from the JSON).
How can I achieve this following Terraform best practices?
The main issue with the external data source is that the program output must be a JSON-encoded map of string keys and string values:
The program must then produce a valid JSON object on stdout, which will be used to populate the result attribute exported to the rest of the Terraform configuration. This JSON object must again have all of its values as strings. On successful completion it must exit with status zero.
In your case, your output is different, as you have arrays and nested maps:
{
  "test_team": {
    "members": [
      {
        "email": "abc@gmail.com",
        "name": "abc"
      },
      {
        "email": "abcdef@gmail.com",
        "name": "abcdef"
      }
    ]
  }
}
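For reference, the only way to satisfy the constraint directly is to wrap the nested structure into a single string value. Here is a minimal sketch of such a script (the teams key is my own naming, not part of the original script):

```python
import json

data = {
    "test_team": {
        "members": [
            {"email": "abc@gmail.com", "name": "abc"},
            {"email": "abcdef@gmail.com", "name": "abcdef"}
        ]
    }
}

# The external data source only accepts a flat map of string keys to
# string values, so the nested structure is serialized into one
# string value before printing.
result = {"teams": json.dumps(data)}
print(json.dumps(result))
```

On the Terraform side this would then require a double decode, something like jsondecode(data.external.python_output.result.teams).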
If you'd like to use only the external data source, you would have to format the output in your script as a map of string keys and string values. I don't recommend doing that, as jsondecode won't transform that back natively. In your case, I recommend using an intermediate file to store the result of your script:
import json

data = {
    "test_team": {
        "members": [
            {"email": "abc@gmail.com", "name": "abc"},
            {"email": "abcdef@gmail.com", "name": "abcdef"}
        ]
    }
}

output = json.dumps(data)
with open('api_result.json', 'w') as f:
    f.write(output)
print('{}')
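As a quick sanity check (a sketch only, not needed for the Terraform run), you can run a standalone copy of such a script and confirm both halves of the contract: the full payload lands in api_result.json, and stdout is only the empty JSON object:

```python
import json
import subprocess
import sys
import tempfile
import textwrap
from pathlib import Path

# A trimmed-down copy of the modified api.py, run in a temporary
# directory so nothing is written into the current module.
script = textwrap.dedent("""
    import json
    data = {"test_team": {"members": [{"email": "abc@gmail.com", "name": "abc"}]}}
    with open('api_result.json', 'w') as f:
        f.write(json.dumps(data))
    print('{}')
""")

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "api.py"
    path.write_text(script)
    proc = subprocess.run([sys.executable, str(path)], cwd=tmp,
                          capture_output=True, text=True, check=True)
    stdout = proc.stdout.strip()
    payload = json.loads((Path(tmp) / "api_result.json").read_text())
```

If stdout were anything other than a valid JSON object of strings, the external data source would fail the whole plan.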
There are two things to note about this modified script:

- the full result is written to an intermediate file (api_result.json)
- the script prints an empty JSON object ({}) if it succeeds, as expected by the external data source docs

Once this script is modified, you can use the local_file data source to read that file. You will need to add an explicit dependency (data.external.python_script) to be sure the file is read after your script runs. From there, you can load the contents of api_result.json into a local variable (json_data) with jsondecode, and apply your own logic. Here's the final result (with comments for explanations):
// python script api.py writes its result to api_result.json
data "external" "python_script" {
  program = ["python", "${path.module}/api.py"]
}

// this is the output of api.py
data "local_file" "python_output" {
  filename = "${path.module}/api_result.json"
  // explicit dependency: the file only exists or is up-to-date after api.py has run
  depends_on = [data.external.python_script]
}

locals {
  // json_data is based on the local_file content
  json_data = jsondecode(data.local_file.python_output.content)
  unique_mails = distinct(flatten([
    for team_key, team_data in local.json_data : [
      for member in team_data["members"] : member["email"]
    ]
  ]))
}

resource "user" "user" {
  for_each = { for email in local.unique_mails : email => email }
  name     = each.key
  role     = "user"
}
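To reason about what the locals block computes, the distinct(flatten([...])) expression can be mirrored in plain Python (a sketch only, not part of the Terraform run; the other_team entry is invented here to show deduplication across teams):

```python
import json

# Simulated content of api_result.json, with an extra team added to
# demonstrate that duplicate emails are collapsed.
api_result = json.dumps({
    "test_team": {
        "members": [
            {"email": "abc@gmail.com", "name": "abc"},
            {"email": "abcdef@gmail.com", "name": "abcdef"}
        ]
    },
    "other_team": {
        "members": [
            {"email": "abc@gmail.com", "name": "abc"}
        ]
    }
})

json_data = json.loads(api_result)

# Equivalent of distinct(flatten([...])): collect every member email
# across all teams, dropping duplicates while preserving order.
unique_mails = list(dict.fromkeys(
    member["email"]
    for team in json_data.values()
    for member in team["members"]
))
```

Each entry in unique_mails becomes one for_each key, i.e. one user resource instance.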
This will create one user resource per unique mail. If your api.py script:

- returns a new email, a new user resource will be created for that user
- no longer returns an email, the corresponding user resource will be destroyed

If you'd like to test that behavior, you can define a local_file resource:
// will create one file per email
// filename = email, and content is empty as we don't care
resource "local_file" "test" {
  for_each = { for email in local.unique_mails : email => email }
  filename = "${path.module}/${each.key}"
  content  = ""
}
Then check that there are as many files created in your current directory as unique emails.
Since api_result.json may contain sensitive data (emails), I strongly suggest adding it to .gitignore. In any case, terraform apply will recreate that file, so as long as the user or CI/CD process running the terraform command can run the script (i.e. it has Python installed), the file will be regenerated.