Tags: python, regex, pydantic

Pydantic validation. Check that a string does not contain certain characters


I need to make sure that the string does not contain Cyrillic characters. I check like this:

from pydantic import BaseModel, Field

class MyModel(BaseModel):
    content_en: str = Field(pattern=r"[^а-яА-Я]")


data = MyModel(content_en="Has wrong content 'йцукен'")
print(data)
# prints: content_en="Has wrong content 'йцукен'"

But when I pass a string containing Cyrillic characters to the content_en field, no error is raised.
Expected:

pydantic_core._pydantic_core.ValidationError: 1 validation error for MyModel
...

How do I check this correctly?
Python 3.8
Pydantic 2.5

Solution (thanks @chepner):

class MyModel(BaseModel):
    content_en: str = Field(pattern=r"^[^а-яА-ЯёЁ]*$")
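A minimal sketch of how this final pattern could be exercised, assuming Pydantic 2.x is installed. The helper name `is_accepted` is hypothetical and only exists for this illustration; it turns the `ValidationError` into a boolean so the three cases are easy to compare.

```python
from pydantic import BaseModel, Field, ValidationError


class MyModel(BaseModel):
    # Anchored pattern: every character must fall outside а-я, А-Я, ё and Ё.
    content_en: str = Field(pattern=r"^[^а-яА-ЯёЁ]*$")


def is_accepted(text: str) -> bool:
    """Hypothetical helper: True if MyModel accepts the text, False otherwise."""
    try:
        MyModel(content_en=text)
    except ValidationError:
        return False
    return True


print(is_accepted("Plain English text"))          # True
print(is_accepted("Has wrong content 'йцукен'"))  # False
print(is_accepted("ё"))                           # False
```

Because the pattern uses `*`, an empty string is also accepted; replace `*` with `+` if empty values should be rejected.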

Solution

  • Your pattern is not anchored, so it matches any string containing at least one non-Cyrillic character, not only strings consisting solely of non-Cyrillic characters. The mixed string passes, and only an all-Cyrillic string is rejected:

    >>> MyModel(content_en="Has wrong content 'йцукен'")
    MyModel(content_en="Has wrong content 'йцукен'")
    >>> MyModel(content_en="йцукен")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/Users/chepner/py311/lib/python3.11/site-packages/pydantic/main.py", line 164, in __init__
        __pydantic_self__.__pydantic_validator__.validate_python(data, self_instance=__pydantic_self__)
    pydantic_core._pydantic_core.ValidationError: 1 validation error for MyModel
    content_en
      String should match pattern '[^а-яА-Я]' [type=string_pattern_mismatch, input_value='йцукен', input_type=str]
        For further information visit https://errors.pydantic.dev/2.5/v/string_pattern_mismatch
    

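    The correct pattern is ^[^а-яА-Я]*$, which is anchored at both ends so that every character must be non-Cyrillic. (The letters ё and Ё lie outside the а-я and А-Я ranges, so add them to the class if they must be rejected too.)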

    >>> class MyModel(BaseModel):
    ...     content_en: str = Field(pattern=r"^[^а-яА-Я]*$")
    ...
    >>> MyModel(content_en="Has wrong content 'йцукен'")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/Users/chepner/py311/lib/python3.11/site-packages/pydantic/main.py", line 164, in __init__
        __pydantic_self__.__pydantic_validator__.validate_python(data, self_instance=__pydantic_self__)
    pydantic_core._pydantic_core.ValidationError: 1 validation error for MyModel
    content_en
      String should match pattern '^[^а-яА-Я]*$' [type=string_pattern_mismatch, input_value="Has wrong content 'йцукен'", input_type=str]
        For further information visit https://errors.pydantic.dev/2.5/v/string_pattern_mismatch
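The difference between the two patterns can be reproduced with the standard library alone. This is a rough sketch that uses Python's `re.search` as a stand-in for Pydantic's pattern check; that the two behave the same for these patterns is an assumption, although the results agree with the sessions above. The names `UNANCHORED`, `ANCHORED`, `ANCHORED_YO` and `passes` are made up for the illustration.

```python
import re

UNANCHORED = re.compile(r"[^а-яА-Я]")        # pattern from the question
ANCHORED = re.compile(r"^[^а-яА-Я]*$")       # pattern from the answer
ANCHORED_YO = re.compile(r"^[^а-яА-ЯёЁ]*$")  # same, with ё and Ё added


def passes(pattern, text):
    """Return True if the pattern is found anywhere in the text."""
    return pattern.search(text) is not None


mixed = "Has wrong content 'йцукен'"

# One non-Cyrillic character is enough for the unanchored pattern.
print(passes(UNANCHORED, mixed))     # True
print(passes(UNANCHORED, "йцукен"))  # False

# The anchored pattern requires every character to be non-Cyrillic.
print(passes(ANCHORED, mixed))       # False
print(passes(ANCHORED, "plain"))     # True

# ё is outside the а-я range, so it slips through unless listed explicitly.
print(passes(ANCHORED, "ё"))         # True
print(passes(ANCHORED_YO, "ё"))      # False
```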