I am using cerberus to validate my YAML file against a predefined schema, as shown below:
import cerberus
import yaml

schema_text = '''
name:
type: string
age:
type: integer
min: 10
'''
input_text = '''
name: Little Joe # *(Line 1)*
age: 5 # *(Line 2)*
'''
schema_yaml = yaml.safe_load(schema_text)
input_yaml = yaml.safe_load(input_text)
v = cerberus.Validator()
v.validate(input_yaml, schema_yaml)
v.errors
# {'age': ['min value is 10']}
When handling YAML validation errors, instead of just displaying the error message to the user, it would be super helpful to display the line number(s) of the validation error as well, so the user can figure out what's going on. Something like:
{'age': ['min value is 10..error found at line number 2']}
Is there such an option available in cerberus? Any leads would be very helpful.
There are multiple things you should be aware of. First of all, the YAML in schema_text is invalid, as all keys for a single mapping need to be unique. PyYAML will, however, happily load it, overwriting string with integer. You actually want to get an error message here, so that you detect that you should indent some of the lines in schema_text. You should also make it a habit to add a backslash after the opening triple-quotes, otherwise your string starts with an empty line and your counting of line numbers will be off by one.
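To make the off-by-one point concrete, here is a minimal sketch using only the standard library. The helper line_of is hypothetical (it is not part of PyYAML, ruamel.yaml, or cerberus) and does a naive text search rather than real YAML parsing; it only illustrates how the leading empty line shifts the count.

```python
# Same two YAML lines, once with and once without the backslash
# after the opening triple-quotes.
with_blank = '''
name: Little Joe
age: 5
'''
without_blank = '''\
name: Little Joe
age: 5
'''


def line_of(text, key):
    """Hypothetical helper: 1-based number of the first line starting with 'key:'."""
    for number, line in enumerate(text.splitlines(), start=1):
        if line.startswith(key + ':'):
            return number
    return None


print(line_of(with_blank, 'age'))     # 3, the empty first line is counted
print(line_of(without_blank, 'age'))  # 2, matches what the user sees in the file
```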
Using ruamel.yaml (disclaimer: I am the author of that package) you can keep track of the lines a key was assigned to during the creation of the mapping. The start_mark of the key node has the line number (starting at 0):
import sys
from collections.abc import Hashable  # used by my_construct_mapping below

import cerberus
import ruamel.yaml
schema_text = '''\
name:
  type: string
age:
  type: integer
  min: 10
'''
input_text = '''\
name: Little Joe # *(Line 1)*
age: 5 # *(Line 2)*
'''
yaml = ruamel.yaml.YAML(typ='safe') # no need for linenumbers in the schema
schema = yaml.load(schema_text)
v = cerberus.Validator()
yaml = ruamel.yaml.YAML()
def my_construct_mapping(self, node, maptyp, deep=False):
    if not isinstance(node, ruamel.yaml.nodes.MappingNode):
        raise ruamel.yaml.constructor.ConstructorError(
            None, None, f'expected a mapping node, but found {node.id!s}', node.start_mark,
        )
    total_mapping = maptyp
    if getattr(node, 'merge', None) is not None:
        todo = [(node.merge, False), (node.value, False)]
    else:
        todo = [(node.value, True)]
    for values, check in todo:
        mapping = self.yaml_base_dict_type()
        for key_node, value_node in values:
            # keys can be list -> deep
            key = self.construct_object(key_node, deep=True)
            # lists are not hashable, but tuples are
            if not isinstance(key, Hashable):
                if isinstance(key, list):
                    key = tuple(key)
            if not isinstance(key, Hashable):
                raise ruamel.yaml.constructor.ConstructorError(
                    'while constructing a mapping',
                    node.start_mark,
                    'found unhashable key',
                    key_node.start_mark,
                )
            value = self.construct_object(value_node, deep=deep)
            if check:
                if self.check_mapping_key(node, key_node, mapping, key, value):
                    mapping[key] = value
            else:
                mapping[key] = value
            if not hasattr(self.loader, 'keyline'):
                self.loader.keyline = {}
            self.loader.keyline[key] = key_node.start_mark.line + 1  # ruamel.yaml starts line-count at 0
        total_mapping.update(mapping)
    return total_mapping
yaml.Constructor.construct_mapping = my_construct_mapping
data = yaml.load(input_text)
v.validate(data, schema)
for key, val in v.errors.items():
    print(f'error for key "{key}" at line {yaml.keyline[key]}: {"".join(val)}')
which gives:
error for key "age" at line 2: min value is 10
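If you would rather have the dictionary shape from the question than printed lines, the two mappings can be merged afterwards. The following is only a sketch: annotate_errors is a hypothetical helper, not a cerberus option, and the errors and keyline dictionaries are hard-coded here to mirror v.errors and yaml.keyline from the run above. It assumes a flat document; nested error entries (which are not plain strings) are passed through unchanged.

```python
def annotate_errors(errors, keyline):
    """Hypothetical helper: append the line number to each string error message.

    errors  -- mapping of key to list of messages, shaped like v.errors
    keyline -- mapping of key to 1-based line number, shaped like yaml.keyline
    """
    annotated = {}
    for key, messages in errors.items():
        line = keyline.get(key)
        annotated[key] = [
            f'{msg}..error found at line number {line}'
            if isinstance(msg, str) and line is not None
            else msg  # leave nested or unlocatable entries untouched
            for msg in messages
        ]
    return annotated


# Stand-ins for v.errors and yaml.keyline from the example above.
errors = {'age': ['min value is 10']}
keyline = {'name': 1, 'age': 2}
print(annotate_errors(errors, keyline))
# {'age': ['min value is 10..error found at line number 2']}
```

Note that keyline is keyed by the bare key name, so the same key appearing in several nested mappings would overwrite earlier entries; for nested documents you would need to key on the full path instead.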