I am trying to test a number of Python 2.7 classes using unittest.
Here is the exception:
ScannerError: mapping values are not allowed here
in "<unicode string>", line 3, column 32:
... file1_with_path: '../../testdata/concat1.csv'
Here is the example the error message relates to:
class TestConcatTransform(unittest.TestCase):
    def setUp(self):
        filename1 = os.path.dirname(os.path.realpath(__file__)) + '/../../testdata/concat1.pkl'
        self.df1 = pd.read_pickle(filename1)
        filename2 = os.path.dirname(os.path.realpath(__file__)) + '/../../testdata/concat2.pkl'
        self.df2 = pd.read_pickle(filename2)
        self.yamlconfig = u'''
        --- !ConcatTransform
        file1_with_path: '../../testdata/concat1.csv'
        file2_with_path: '../../testdata/concat2.csv'
        skip_header_lines: [0]
        duplicates: ['%allcolumns']
        outtype: 'dataframe'
        client: 'testdata'
        addcolumn: []
        '''
        self.testconcat = yaml.load(self.yamlconfig)
What is the problem?
Something else that is not clear to me: the directory structure I have is:
app
app/etl
app/tests
ConcatTransform is in app/etl/concattransform.py and TestConcatTransform is in app/tests. I import ConcatTransform into the TestConcatTransform unittest with this import:
from app.etl import concattransform
How does PyYAML associate that class with the one defined in yamlconfig?
A YAML document can start with a document start marker (---), but the marker has to be at the beginning of a line, and yours is indented eight positions on the second line of the input. That causes the --- to be interpreted as the beginning of a multi-line plain (i.e. non-quoted) scalar, and within such a scalar you cannot have ": " (colon + space); that combination is only allowed in quoted scalars. And if your document does not have a mapping or sequence at the root level, as yours doesn't, the whole document can only consist of a single scalar.
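To see the difference for yourself, here is a minimal sketch, assuming PyYAML is installed. The helper name try_compose and the shortened documents are made up for illustration. It uses yaml.compose, which only builds the node tree, so no constructor for the tag is needed.

```python
import yaml  # PyYAML

# The marker is indented, so it is scanned as the start of a plain scalar.
indented = u"""
        --- !ConcatTransform
        file1_with_path: '../../testdata/concat1.csv'
"""

# The marker is at the beginning of a line, so it starts a document.
flush_left = u"""\
--- !ConcatTransform
file1_with_path: '../../testdata/concat1.csv'
"""


def try_compose(text):
    """Return the root node's tag, or the problem reported by the parser."""
    try:
        return yaml.compose(text).tag
    except yaml.MarkedYAMLError as exc:
        return exc.problem


indented_result = try_compose(indented)
flush_left_result = try_compose(flush_left)
print(indented_result)
print(flush_left_result)
```

The first call should report the same problem as your traceback, while the second should return the tag of the root mapping.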
If you want to keep your sources nicely indented like you have now, I recommend you use dedent from textwrap.
The following runs without error:
import ruamel.yaml
from textwrap import dedent

yaml_config = dedent(u'''\
    --- !ConcatTransform
    file1_with_path: '../../testdata/concat1.csv'
    file2_with_path: '../../testdata/concat2.csv'
    skip_header_lines: [0]
    duplicates: ['%allcolumns']
    outtype: 'dataframe'
    client: 'testdata'
    addcolumn: []
    ''')

yaml = ruamel.yaml.YAML()
data = yaml.load(yaml_config)
You should get into the habit of putting a backslash (\) at the end of your opening triple-quotes, so that your YAML document starts on the first line of the string. Had you done that, the error would have indicated line 2 instead of line 3, because the document would no longer start with an empty line.
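The effect of the backslash can be checked with the standard library alone. This is a small sketch with a shortened, made-up document; the variable names are only for illustration.

```python
from textwrap import dedent

# No backslash: the string starts with a newline, so the YAML
# document only begins on line 2 of the input.
without_backslash = dedent(u'''
    --- !ConcatTransform
    client: 'testdata'
    ''')

# Backslash after the opening quotes: the newline is swallowed and
# the document start marker sits on line 1, column 0.
with_backslash = dedent(u'''\
    --- !ConcatTransform
    client: 'testdata'
    ''')

print(repr(without_backslash.splitlines()[0]))
print(repr(with_backslash.splitlines()[0]))
```

In both cases dedent removes the common leading whitespace, so the --- ends up at the beginning of a line; the backslash only decides whether an empty first line shifts the line numbers in error messages.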
During loading, the YAML parser encounters the tag !ConcatTransform. A constructor for an object is probably registered with the PyYAML loader during the import, associating that tag with the class using PyYAML's add_constructor.
Unfortunately, they registered their constructor with the default, non-safe loader. That is not necessary: they could have registered it with the SafeLoader and thereby not forced users to risk problems with uncontrolled input.
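For illustration, registering such a constructor on the SafeLoader could look roughly like the sketch below. It assumes PyYAML is installed. The ConcatTransform class here is a hypothetical stand-in, not the real one from app/etl/concattransform.py, and the function name construct_concat_transform is made up as well; the real module may do this differently, for example by subclassing yaml.YAMLObject.

```python
import yaml  # PyYAML


class ConcatTransform(object):
    """Hypothetical stand-in for the real class in app/etl."""

    def __init__(self, **settings):
        self.settings = settings


def construct_concat_transform(loader, node):
    # Convert the tagged mapping node into keyword arguments.
    return ConcatTransform(**loader.construct_mapping(node, deep=True))


# Registering on SafeLoader makes the tag usable with yaml.safe_load().
yaml.add_constructor(u'!ConcatTransform', construct_concat_transform,
                     Loader=yaml.SafeLoader)

obj = yaml.safe_load(
    u"--- !ConcatTransform\n"
    u"client: 'testdata'\n"
    u"skip_header_lines: [0]\n"
)
print(type(obj).__name__)
print(obj.settings['client'])
```

Because the registration runs at module level, it happens as a side effect of importing the module, which is how the tag in your YAML string gets tied to the class.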