python-3.x, validation, cerberus

Using cerberus regex to validate string ends with pattern


The Cerberus documentation says the library supports regex validation, but it does not seem to work in a variety of cases, and the documentation on this rule is sparse. When I try to validate that a string ends with ".csv", validation always fails, even when the pattern is just part of the file name itself. I assume Cerberus modifies the regex behind the scenes before applying it.

# -----
# Import and print versions
# -----
import sys
print(sys.version)
# >>> 3.7.4 (default, Aug 13 2019, 15:17:50) 
#     [Clang 4.0.1 (tags/RELEASE_401/final)]

import cerberus
print(cerberus.__version__)
# >>> 1.3.2

# -----
# Define schema to check file extension is ".csv"
# -----
schema1 = {
    'test': {
        'type': 'string',
        'regex': r'\.csv$'
    }
}
schema2 = {
    'test': {
        'type': 'string',
        'regex': r'\\.csv$'
    }
}
schema3 = {
    'test': {
        'type': 'string',
        'regex': r'test'
    }
}

# -----
# Instantiate validation and run examples
# -----
v = cerberus.Validator()

print(v.validate({'test': 'test.csv'}, schema1))
# >>> False

print(v.validate({'test': 'test.csv'}, schema2))
# >>> False

print(v.validate({'test': 'test.csv'}, schema3))
# >>> False

Solution

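  • Indeed, Cerberus adds a $ suffix to the constraint if it is not already there, and (as far as I can tell from the 1.3.x source) it applies the pattern with re.match, which only matches from the start of the string. The constraint therefore has to describe the whole string, which is why r"\.csv$" and r"test" both fail against "test.csv". Hence r".*\.csv" should work as the constraint for your problem. The design rationale is that partial or multiple matches are not a use case for validation, and being explicit about the whole string's structure is better than ignoring part of it.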
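
    To see why the three schemas in the question fail, here is a minimal standard-library sketch of the behaviour described above. The helper matches_like_cerberus is hypothetical and is not part of Cerberus; it assumes the rule appends "$" when missing and then uses re.match, so treat it as an approximation rather than the library's actual code.

```python
import re


def matches_like_cerberus(pattern: str, value: str) -> bool:
    """Hypothetical helper: approximates how the 'regex' rule is assumed to work.

    Assumption: a trailing '$' is appended when missing, and the pattern is
    applied with re.match, so it is anchored at both ends of the string.
    """
    if not pattern.endswith('$'):
        pattern += '$'
    return re.match(pattern, value) is not None


filename = 'test.csv'

# The three patterns from the question: none of them describes the whole string.
print(matches_like_cerberus(r'\.csv$', filename))   # False: 't' is not '.'
print(matches_like_cerberus(r'\\.csv$', filename))  # False: expects a literal backslash
print(matches_like_cerberus(r'test', filename))     # False: 'test$' leaves '.csv' unmatched

# The suggested pattern covers everything before the extension as well.
print(matches_like_cerberus(r'.*\.csv', filename))        # True
print(matches_like_cerberus(r'.*\.csv', 'data.csv.bak'))  # False: does not end in .csv
```

    In the original schema this corresponds to 'regex': r'.*\.csv' for the 'test' field; the trailing $ can be written explicitly or left out, since it is added either way.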