Tags: python, yaml, pyyaml

How to encase the value of a yaml in single quotes after a dictionary <> yaml serialization


I'd like to convert my dictionary to a YAML document, where the keys are rendered without quotes, but the values are encased in single quotes.

I found several solutions to encase both the key and value in a single quote, but that's not what I'd like. Below you can see an example script:

import yaml

theDict = {'this': {'is': 'the', 'main': 12, 'problem': 'see?'}}

print(yaml.dump(theDict, default_flow_style=False, sort_keys=False))

This will output:

this:
  is: the
  main: 12
  problem: see?

However, I want:

this:
  is: 'the'
  main: '12'
  problem: 'see?'

I don't want:

this:
  'is': 'the'
  'main': '12'
  'problem': 'see?'

I also don't want:

'this':
  'is': 'the'
  'main': '12'
  'problem': 'see?'

The answers flagged as duplicates of this question are not duplicates: those questions want both the key and the value encased in quotes, which is not what I want. I'd like the YAML serialization to quote the values only, not the keys.


Solution

  • The representers in the YAML libraries for Python that I know of do not get context information, so it is non-trivial to distinguish, among the scalars being dumped, between those that are keys and those that are values.

    What you should do is create a class that behaves like a string but is explicitly dumped with single quotes. This can be done in PyYAML, but I recommend you upgrade to my ruamel.yaml, which not only has this functionality built in (so it can preserve quotes when round-tripping a YAML document), but also implements the YAML 1.2 standard released in 2009, whereas PyYAML implements the older YAML 1.1 standard with some extra restrictions.

    In both cases (upgrade or not), this means you need to explicitly mark the elements that should be single-quoted, either while you create your data structure or by (recursively) walking over the data structure and updating the values before dumping.

    Here I do so during creation:

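    import sys
    import ruamel.yaml

    # Short alias for the string subclass that is always dumped in single quotes
    S = ruamel.yaml.scalarstring.SingleQuotedScalarString

    # Wrap only the values; the keys stay plain strings and are dumped unquoted
    theDict = {'this': {'is': S('the'), 'main': S(12), 'problem': S('see?')}}

    yaml = ruamel.yaml.YAML()
    yaml.dump(theDict, sys.stdout)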
    

    which gives:

    this:
      is: 'the'
      main: '12'
      problem: 'see?'
    

    But this will make your 12 behave like a string in your Python code. If that is undesirable, you need to either convert just before dumping, or make an extra subclass of int that dumps as a SingleQuotedScalarString (such a hybrid is not available off the shelf in ruamel.yaml).


    As an aside: when using PyYAML, you should not make a habit of writing:

    print(yaml.dump(data, ...))
    

    that creates an unnecessary memory buffer for the output, then streams it out and discards the buffer. Instead, stream directly to sys.stdout, which is faster (and far less likely to give you memory errors when dumping large data structures):

    yaml.dump(data, sys.stdout, ...)
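
For those who stay on PyYAML, the string-like class mentioned at the start of this answer might look roughly like this. `SingleQuoted` and `represent_single_quoted` are hypothetical names, not part of PyYAML; the sketch relies only on `yaml.add_representer` and the dumper's `represent_scalar`, and it streams straight to `sys.stdout` as recommended above:

```python
import sys

import yaml


class SingleQuoted(str):
    """Hypothetical marker type: a str that should be dumped in single quotes."""


def represent_single_quoted(dumper, data):
    # Force the single-quoted style for this scalar only.
    return dumper.represent_scalar('tag:yaml.org,2002:str', str(data), style="'")


# Registers the representer on PyYAML's default Dumper class.
yaml.add_representer(SingleQuoted, represent_single_quoted)

Q = SingleQuoted
the_dict = {'this': {'is': Q('the'), 'main': Q(12), 'problem': Q('see?')}}

# Keys are plain str objects, so they keep PyYAML's default (unquoted) style.
yaml.dump(the_dict, sys.stdout, default_flow_style=False, sort_keys=False)
```

As with the ruamel.yaml version, `Q(12)` is a string in your Python code, so the same caveat about integers applies.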