I have a YAML file and would like to constrain a certain field to contain no whitespace.
Here's a script that demonstrates my attempt:
test.py
#!/usr/bin/env python3
import os
from ruamel import yaml

def read_conf(path_to_config):
    if os.path.exists(path_to_config):
        conf = open(path_to_config).read()
        return yaml.load(conf)
    return None

if __name__ == "__main__":
    settings = read_conf("hello.yaml")
    print("type of name: {0}, repr of name: {1}".format(type(
        settings['foo']['name']), repr(settings['foo']['name'])))
    if any(c.isspace() for c in settings['foo']['name']):
        raise Exception("No whitespace allowed in name!")
Here is my first cut of the YAML file:
hello.yaml
foo:
  name: "hello\t"
With the above YAML file, an exception is correctly raised:
type of name: <class 'str'>, repr of name: 'hello\t'
Traceback (most recent call last):
  File "./test.py", line 16, in <module>
    raise Exception("No whitespace allowed in name!")
Exception: No whitespace allowed in name!
However, if I change the double quotes to single quotes, no exception is raised:
08:23 $ ./test.py
type of name: <class 'str'>, repr of name: 'hello\\t'
This behavior occurs with both ruamel.yaml==0.11.11 and PyYAML==3.11.
Why is there a difference between single and double quotes in these Python YAML parsers when, as I understand it, there is no functional difference between them in the YAML specification? How can I prevent special characters from being escaped?
There is a vast difference in the YAML specification between single and double quoted strings. Within single quoted scalars you can only escape the single quote:
The single-quoted style is specified by surrounding “'” indicators. Therefore, within a single-quoted scalar, such characters need to be repeated. This is the only form of escaping performed in single-quoted scalars.
Therefore \ in 'hello\t' has no special function and that scalar consists of the letters h, e, l (2x), o, \ and t.
Backslash escaping is only supported in double quoted YAML scalars.
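To see the difference without editing hello.yaml, here is a minimal sketch that feeds both quoting styles straight to the parser. It assumes PyYAML is installed and uses yaml.safe_load; the helper has_whitespace is just a name I made up for the check from your script, and ruamel.yaml should give the same values for these scalars.

```python
import yaml  # PyYAML; assumed to be installed

# Raw Python strings, so the parser receives a literal backslash followed by "t".
double_quoted = yaml.safe_load(r'name: "hello\t"')["name"]
single_quoted = yaml.safe_load(r"name: 'hello\t'")["name"]
# The only escape in single-quoted scalars: a doubled single quote.
doubled_quote = yaml.safe_load(r"name: 'it''s'")["name"]


def has_whitespace(value):
    """Hypothetical helper mirroring the check in test.py."""
    return any(c.isspace() for c in value)


print(repr(double_quoted), has_whitespace(double_quoted))  # backslash-t became a tab
print(repr(single_quoted), has_whitespace(single_quoted))  # backslash and t kept as-is
print(repr(doubled_quote))
```

The double-quoted value has six characters and ends in a real tab, so the whitespace check fires. The single-quoted value has seven characters (a backslash followed by t) and contains no whitespace at all, which is why no exception is raised.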