Write YAML file from python dict containing special characters (asterisk, ampersand, *, &)


I have a YAML file that I need to import into Python, process in some way, and then export as a YAML file again. More precisely, I import a YAML config file as a dict, generate many variants with altered parameters, and then write them all out as YAML files again.

The problem I'm facing is that some parameter values start with a special character (*, &), e.g. *target_size.

When I work with the dict in Python, this parameter is a dictionary value stored as a string ('*target_size'). When I write the dict to a YAML file, the string formatting is preserved, i.e. '*target_size' is wrapped in quotes in the resulting YAML file. What I need is just *target_size, the same as in the original file.

I've looked through the PyYAML docs and other resources but didn't find a solution.

Code to write the YAML file:

    import yaml

    with open(f'{PATH}/base_config.yml', 'w') as outfile:
        yaml.dump(config, outfile, default_flow_style=False, sort_keys=False)

[Screenshots: original YAML, Python dict, resulting file]


Solution

  • The unquoted asterisk (*) and ampersand (&) are special characters in YAML, marking aliases and anchors respectively. They let one part of a YAML document refer to another part of the same document.

    When you deserialize a YAML document into a Python data structure, you lose any information about the anchors and aliases that were present in the original document.

    When you serialize a Python data structure to YAML, the yaml module will automatically generate anchors and aliases where appropriate to represent self-referential data structures. For example, given this:

    >>> import yaml
    >>> doc = {'a': {'example': 'this is a test'}}
    >>> doc['b'] = doc['a']
    >>> print(yaml.safe_dump(doc))
    

    We see the following output:

    a: &id001
      example: this is a test
    b: *id001
    

    You're not going to be able to preserve these across a deserialization/serialization pipeline using the standard Python yaml module.
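
To make the round trip concrete, here is a minimal sketch using only PyYAML. The config content and the key names (base_size, model, input_size) are made up for illustration and are not taken from the question's file. It shows that the alias is resolved when the document is loaded, that the original anchor name does not come back when the data is dumped, and that a plain Python string starting with an asterisk is quoted on output so that it reads back as a string rather than as an alias.

```python
import yaml  # PyYAML

# Hypothetical config: one anchor (&target_size) and one alias (*target_size).
source = """\
base_size: &target_size [224, 224]
model:
  input_size: *target_size
"""

config = yaml.safe_load(source)

# The alias is resolved at load time: both keys refer to the same list object,
# and the name "target_size" is not stored anywhere in the resulting dict.
same_object = config["base_size"] is config["model"]["input_size"]

# Dumping detects the shared object and emits an anchor/alias pair again,
# but under an auto-generated name instead of the original one.
dumped = yaml.safe_dump(config, sort_keys=False)

# A plain string that merely looks like an alias is a different case:
# it is written with quotes so that loading it yields the same string back.
quoted = yaml.safe_dump({"size": "*target_size"})

print(same_object)
print(dumped)
print(quoted)
```

The last case is the one described in the question: once the value is an ordinary string in the dict, the dumper has no way of knowing it was meant as an alias, so quoting it is the only output that loads back to the same data.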