Tags: python, yaml, pyyaml

Dump a list in Python dict into YAML file as examples


I have a Python dict as below and I want to dump it into a YAML file. I want each examples value to be dumped as a single block of text listing the examples, instead of as individual sequence items (note the pipe | after examples in the expected output). How can I do that?

import yaml

data = {'version': '1.0',
 'food': [{'category': 'japanese', 'examples': ['sushi', 'ramen', 'sashimi']},
 {'category': 'chinese', 'examples': ['hotpot', 'noodle', 'fried rice']}]}

# Never emit anchors/aliases for repeated objects
yaml.SafeDumper.ignore_aliases = lambda *args : True
with open('food.yml', 'w', encoding = "utf-8") as yaml_file:
    # line_break must be one of '\r', '\n' or '\r\n'; other values are ignored
    dump = yaml.safe_dump(data,
                          default_flow_style=False,
                          allow_unicode=False,
                          encoding=None,
                          sort_keys=False,
                          line_break='\n')
    yaml_file.write(dump)

Result

version: '1.0'
food:
- category: japanese
  examples:
  - sushi
  - ramen
  - sashimi
- category: chinese
  examples:
  - hotpot
  - noodle
  - fried rice

Expected

version: '1.0'
food:
- category: japanese
  examples: |
    - sushi
    - ramen
    - sashimi
- category: chinese
  examples: |
    - hotpot
    - noodle
    - fried rice

Solution

  • The constructs with | are literal style scalars. Within their indented context, leading dashes have no special meaning (i.e. they are not parsed as block style sequence indicators). PyYAML doesn't support dumping individual strings as literal scalars without adding representers.

    PyYAML only supports a subset of YAML 1.1, and that specification was superseded in 2009. The recommended extension for files containing YAML documents has been .yaml since at least [September 2006](https://web.archive.org/web/20060924190202/http://yaml.org/faq.html).

    It is not clear why you use these outdated libraries and conventions, but it looks like it is time to upgrade:

    import sys
    import ruamel.yaml

    def literal(*args):
        # Build one multiline string that looks like a block sequence;
        # str() guards against non-string arguments.
        return ruamel.yaml.scalarstring.LiteralScalarString(
            '- ' + '\n- '.join(str(x) for x in args) + '\n'
        )

    data = {'version': '1.0',
     'food': [{'category': 'japanese', 'examples': literal('sushi', 'ramen', 'sashimi')},
     {'category': 'chinese', 'examples': literal('hotpot', 'noodle', 'fried rice')}]}

    yaml = ruamel.yaml.YAML()
    yaml.dump(data, sys.stdout)

    which gives:

    version: '1.0'
    food:
    - category: japanese
      examples: |
        - sushi
        - ramen
        - sashimi
    - category: chinese
      examples: |
        - hotpot
        - noodle
        - fried rice
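
If switching libraries is not an option, the "adding representers" route mentioned above could look roughly like the following with PyYAML alone. This is only a minimal sketch: `LiteralStr`, `represent_literal` and `literal` are hypothetical names made up for this example, not part of PyYAML, and it assumes the strings contain nothing (such as trailing spaces) that would make the emitter fall back to a quoted style. It also loads the output back to show that each examples value is one string, not a list.

```python
import yaml


class LiteralStr(str):
    """Hypothetical marker type: instances are dumped in literal (|) style."""


def represent_literal(dumper, value):
    # Represent as an ordinary YAML string, but request literal block style.
    return dumper.represent_scalar('tag:yaml.org,2002:str', value, style='|')


# Register on SafeDumper so that yaml.safe_dump() uses the representer.
yaml.SafeDumper.add_representer(LiteralStr, represent_literal)


def literal(*items):
    # Join the items into one multiline string that merely looks like a sequence.
    return LiteralStr(''.join(f'- {item}\n' for item in items))


data = {
    'version': '1.0',
    'food': [
        {'category': 'japanese', 'examples': literal('sushi', 'ramen', 'sashimi')},
        {'category': 'chinese', 'examples': literal('hotpot', 'noodle', 'fried rice')},
    ],
}

text = yaml.safe_dump(data, default_flow_style=False, sort_keys=False)
print(text)

# Round trip: the dashes inside a literal scalar are plain text.
loaded = yaml.safe_load(text)
print(repr(loaded['food'][0]['examples']))
```

Only values wrapped with `literal()` are affected; ordinary strings and lists elsewhere in the data are still dumped the usual way.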