I am using Python to parse YAML files.
One of the YAML documents contains a dictionary such as the following:
scrapers:
  results: //article[@class='story ']
This apparently causes a problem because the last apostrophe is preceded by whitespace. Removing the whitespace would solve the problem, but since it is part of an XPath expression I can't.
Does anyone know how I could escape that sequence? I looked into other SO questions, but solutions like wrapping the string in "", or using
scrapers:
  results: //article[@class='story ']
or
scrapers:>
  results: //article[@class='story ']
or
scrapers:
  results: //article[@class='story '']
did not work.
EDIT: I am trying to open a file containing the above expression with:
import yaml
with open('/home/depot/wintergreen/yaml/scrapers.yml', 'r') as f:
    scrapers = yaml.load(f)
However, I receive the error: ScannerError: mapping values are not allowed here, pointing at the whitespace after story.
I have been trying a suggestion offered by an answerer below, i.e. to create the YAML expression from a Python dict. This works. If I save the YAML to a file and load it back again, it also works.
However, when I create the YAML by typing the exact same characters, it does not work...
EDIT2: I think the problem stemmed from the fact that I created the YAML file on a Windows machine and uploaded it to a Unix server.
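One way to test that suspicion is to look at the raw bytes of the file for characters that are easy to introduce on one platform and hard to see in an editor. The following is only a rough sketch using the standard library: the helper name find_suspicious_bytes and the sample bytes are made up for illustration, and the list of patterns is an assumption about likely culprits, not an exhaustive check. Line endings alone are usually tolerated by YAML parsers, so a stray tab, byte order mark, or non-breaking space may be the more likely suspect.

```python
import codecs


def find_suspicious_bytes(raw):
    """Return (description, byte offset) pairs for the first occurrence of
    each hard-to-see byte sequence found in raw (a bytes object)."""
    findings = []
    # A UTF-8 byte order mark only matters at the very start of the file
    if raw.startswith(codecs.BOM_UTF8):
        findings.append(("UTF-8 byte order mark", 0))
    # Assumed list of likely culprits; extend as needed
    patterns = {
        b"\r\n": "CRLF line ending",
        b"\t": "tab character",
        b"\xc2\xa0": "non-breaking space (UTF-8)",
    }
    for needle, description in patterns.items():
        offset = raw.find(needle)
        if offset != -1:
            findings.append((description, offset))
    return findings


# Hypothetical sample standing in for the contents of the uploaded file
sample = b"scrapers:\r\n  results: //article[@class='story\xc2\xa0']\r\n"
for description, offset in find_suspicious_bytes(sample):
    print(description, "at byte", offset)
```

To run it against the real file, read the file in binary mode (open(path, 'rb').read()) and pass the result to the helper instead of the sample.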
It's easy to find the correct YAML format for a structure: create the structure in Python, then use yaml.dump to create the YAML-encoded string:
import yaml
d = {'scrapers': {'results': "//article[@class='story ']"}}
print(d)
print(yaml.dump(d, default_flow_style=False))
The result of which is:
{'scrapers': {'results': "//article[@class='story ']"}}
scrapers:
  results: //article[@class='story ']
That's the correct YAML representation, so if you're having a problem, it's with the parser, not the input text. If you use the standard yaml library it should parse fine.
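A round trip is a simple way to check this. The sketch below assumes PyYAML is installed (it is the package behind import yaml) and uses yaml.safe_load rather than yaml.load, since recent PyYAML versions expect an explicit Loader argument for the latter. The hand_typed string is a made-up stand-in for a file typed by hand with plain spaces and Unix line endings.

```python
import yaml  # PyYAML, a third-party package

d = {'scrapers': {'results': "//article[@class='story ']"}}

# Serialize in block style, then parse the generated text back
text = yaml.dump(d, default_flow_style=False)
loaded = yaml.safe_load(text)

# The XPath, including the space before the closing quote, should survive
assert loaded == d

# The same document typed by hand (two-space indent, no quoting)
hand_typed = "scrapers:\n  results: //article[@class='story ']\n"
assert yaml.safe_load(hand_typed) == d

print(loaded['scrapers']['results'])
```

If both assertions pass but the file on disk still fails to load, the difference is probably in the file's bytes (encoding, tabs, or other invisible characters) rather than in the YAML text itself.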