
How to validate a document against Elasticsearch mapping without indexing it in Python?


I'm working with Elasticsearch in Python and I have a scenario where I want to validate a document against an existing index's mapping before actually sending it for indexing. The goal is to ensure that the document adheres to the expected field types and constraints defined in the mapping.

For example, given the following mapping for a custom index:

"mappings": {
 "properties": {
  "title": { "type": "text" },
  "publish_date": { "type": "date" },
  "views": { "type": "integer"}
 }
}

I want to validate documents like:

{
 "title": "Sample Article",
 "publish_date": "2023-10-05",
 "views": 1234
}

And get feedback if, for instance, views was mistakenly sent as a string instead of an integer.

I'm aware that I could write custom validation logic to achieve this, but I'm looking for a more streamlined solution, possibly a library or tool that can handle this out of the box.

Has anyone come across a Python library or tool that offers this functionality? Or is there a recommended approach to achieve this without manually parsing the mapping and validating each field?


Solution

  • You probably don't want to maintain your index mapping in Elasticsearch on one side and copy/paste the same rules into your Python code on the other. The two will drift apart over the long term!

    You might be looking for something like Pydantic (intro article), which lets you drive the creation of your index mapping directly from the validation rules in your Python code.
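
    As a rough illustration of that idea, here is a minimal sketch assuming Pydantic v2. The Article model mirrors the mapping from the question; ES_TYPES, build_mapping and validation_errors are hypothetical helpers written for this example, not part of Pydantic or elasticsearch-py. Note that Pydantic coerces "1234" into an integer by default, so the views field is marked strict to get the feedback the question asks for.

```python
from datetime import date

from pydantic import BaseModel, Field, ValidationError


class Article(BaseModel):
    title: str
    publish_date: date  # ISO strings such as "2023-10-05" are accepted
    # strict=True stops Pydantic from silently coercing "1234" into 1234
    views: int = Field(strict=True)


# Hypothetical lookup table from Python types to Elasticsearch field types.
# Extend it as the model grows; it only covers the types used above.
ES_TYPES = {str: "text", date: "date", int: "integer"}


def build_mapping(model):
    """Derive an Elasticsearch 'mappings' body from the model's fields."""
    properties = {
        name: {"type": ES_TYPES[field.annotation]}
        for name, field in model.model_fields.items()
    }
    return {"properties": properties}


def validation_errors(model, doc):
    """Return Pydantic's error list for doc (empty if doc is valid)."""
    try:
        model.model_validate(doc)
    except ValidationError as exc:
        return exc.errors()
    return []


good = {"title": "Sample Article", "publish_date": "2023-10-05", "views": 1234}
bad = {"title": "Sample Article", "publish_date": "2023-10-05", "views": "1234"}

print(build_mapping(Article))
print(validation_errors(Article, good))
for err in validation_errors(Article, bad):
    print(err["loc"], err["msg"])
```

    The dict returned by build_mapping could then be passed as the mappings when creating the index, so the Python model stays the single source of truth. This checks documents against the model, not against Elasticsearch itself, so any mapping changes made directly on the cluster would not be reflected.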