Tags: python-3.x, escaping, yaml, pyyaml, ruamel.yaml

Why are PyYAML and ruamel.yaml escaping special characters when single quoted?


I have a YAML file and would like to constrain a certain field to contain no whitespace.

Here's a script that demonstrates my attempt:

test.py

#!/usr/bin/env python3

import os
from ruamel import yaml

def read_conf(path_to_config):
    if os.path.exists(path_to_config):
        with open(path_to_config) as f:
            return yaml.safe_load(f)
    return None

if __name__ == "__main__":
    settings = read_conf("hello.yaml")
    print("type of name: {0}, repr of name: {1}".format(type(
             settings['foo']['name']), repr(settings['foo']['name'])))
    if any(c.isspace() for c in settings['foo']['name']):
        raise Exception("No whitespace allowed in name!")

Here is my first cut of the YAML file:

hello.yaml

foo:
    name: "hello\t"

With the above YAML file, an exception is correctly raised:

type of name: <class 'str'>, repr of name: 'hello\t'
Traceback (most recent call last):
  File "./test.py", line 16, in <module>
    raise Exception("No whitespace allowed in name!")
Exception: No whitespace allowed in name!

However, if I change the double quotes to single quotes, no exception is raised:

$ ./test.py
type of name: <class 'str'>, repr of name: 'hello\\t'

This behavior occurs with both ruamel.yaml==0.11.11 and PyYAML==3.11.

Why do these Python YAML parsers treat single and double quotes differently when, as I understand it, the YAML spec makes no functional distinction between them? How can I prevent special characters from being escaped?


Solution

  • There is a significant difference in the YAML specification between single-quoted and double-quoted strings. Within single-quoted scalars, the only thing you can escape is the single quote itself:

    The single-quoted style is specified by surrounding “'” indicators. Therefore, within a single-quoted scalar, such characters need to be repeated. This is the only form of escaping performed in single-quoted scalars.

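    So the \ in 'hello\t' has no special function, and that scalar consists of the seven characters h, e, l, l, o, \ and t. The 'hello\\t' in the output is only Python's repr of that literal backslash; the parser has not escaped anything.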

    Backslash escaping is only supported in double quoted YAML scalars.
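
The difference can be checked directly. What follows is a minimal sketch, assuming PyYAML is installed; ruamel.yaml should behave the same way for these scalars. The key names double, single and doubled_quote and the has_whitespace helper are made up for illustration and are not part of the original script.

```python
import yaml  # PyYAML (assumed to be installed)

# Raw string, so Python leaves the backslashes alone and the YAML
# parser sees exactly the characters written here.
DOC = r"""
foo:
    double: "hello\t"
    single: 'hello\t'
    doubled_quote: 'it''s'
"""


def has_whitespace(value):
    """Return True if any character in value is whitespace (hypothetical helper)."""
    return any(c.isspace() for c in value)


foo = yaml.safe_load(DOC)["foo"]

# Print each scalar's repr, its length, and whether it contains whitespace.
for key, value in foo.items():
    print(key, repr(value), len(value), has_whitespace(value))
```

The double-quoted value loads as six characters ending in a real tab, so the whitespace check fires. The single-quoted value loads as seven characters with a literal backslash followed by t, so it does not. If the tab is meant to be interpreted, use double quotes in the YAML file; if the backslash is meant literally, the single-quoted form already gives the unescaped string.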