I have a YAML file with configuration data for my application. For debugging purposes, the configuration is dumped to a new file whenever the application runs. Unfortunately, some keys in the YAML file hold sensitive data and need to be obfuscated or simply excluded from the dumped file.
Example YAML input file:
logging_config:
  level: INFO
  file_path: /path/to/log_file.log
database_access:
  table_to_query: customer_table
  database_api_key: XXX-XXX-XXX # Sensitive data, exclude from archived file
There are workarounds, of course:
But I was hoping for a solution along the lines of a custom Loader that reacts to a tag like !keep_secret whenever it appears in a dict value, as that would keep my configuration files more readable.
You can use a custom representer. Here's a basic example:
import yaml

class SensitiveText:
    def __init__(self, content):
        self.content = content

    def __repr__(self):
        return self.content

    def __str__(self):
        return self.content

def sensitive_text_remover(dumper, data):
    return dumper.represent_scalar("tag:yaml.org,2002:null", "")

yaml.add_representer(SensitiveText, sensitive_text_remover)

data = {
    "logging_config": {
        "level": "INFO",
        "file_path": "/path/to/log_file.log"
    },
    "database_access": {
        "table_to_query": "customer_table",
        "database_api_key": SensitiveText("XXX-XXX-XXX")
    }
}

print(yaml.dump(data))
This prints:
database_access:
  database_api_key:
  table_to_query: customer_table
logging_config:
  file_path: /path/to/log_file.log
  level: INFO
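To connect this with the !keep_secret idea from the question, here is a minimal sketch that pairs the representer with a constructor, so that values tagged in the YAML file are wrapped in SensitiveText on load. The names ConfigLoader, ConfigDumper and keep_secret_constructor are made up for illustration, and subclassing SafeLoader/SafeDumper is my assumption about how you would want to load the file, not something the question specifies.

```python
import yaml


class SensitiveText:
    """Wrapper marking a string as sensitive."""

    def __init__(self, content):
        self.content = content

    def __str__(self):
        return self.content


# Hypothetical subclasses, so the stock SafeLoader/SafeDumper stay untouched.
class ConfigLoader(yaml.SafeLoader):
    pass


class ConfigDumper(yaml.SafeDumper):
    pass


def keep_secret_constructor(loader, node):
    # Called for every scalar tagged !keep_secret while loading.
    return SensitiveText(loader.construct_scalar(node))


def sensitive_text_remover(dumper, data):
    # Emit an empty null scalar in place of the secret.
    return dumper.represent_scalar("tag:yaml.org,2002:null", "")


ConfigLoader.add_constructor("!keep_secret", keep_secret_constructor)
ConfigDumper.add_representer(SensitiveText, sensitive_text_remover)

source = """\
database_access:
  table_to_query: customer_table
  database_api_key: !keep_secret XXX-XXX-XXX
"""

config = yaml.load(source, Loader=ConfigLoader)
dumped = yaml.dump(config, Dumper=ConfigDumper)
print(dumped)
```

The application can still read the real value through config["database_access"]["database_api_key"].content, while the dumped text no longer contains it.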
You could, of course, instead use a class for database_access with a representer that removes database_api_key altogether.
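A rough sketch of that variant, assuming a hypothetical DatabaseAccess class (the class name and its representer are invented here for illustration):

```python
import yaml


class DatabaseAccess:
    """Hypothetical container for the database_access section."""

    def __init__(self, table_to_query, database_api_key):
        self.table_to_query = table_to_query
        self.database_api_key = database_api_key


def database_access_representer(dumper, data):
    # Represent the object as a plain mapping that omits database_api_key.
    return dumper.represent_mapping(
        "tag:yaml.org,2002:map",
        {"table_to_query": data.table_to_query},
    )


yaml.add_representer(DatabaseAccess, database_access_representer)

data = {
    "logging_config": {
        "level": "INFO",
        "file_path": "/path/to/log_file.log",
    },
    "database_access": DatabaseAccess("customer_table", "XXX-XXX-XXX"),
}

dumped = yaml.dump(data)
print(dumped)
```

With this approach the database_api_key line does not appear in the dump at all, at the cost of having to construct DatabaseAccess objects yourself instead of using plain dicts.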