We have a process where our web services create log records in Elasticsearch (C#, using NEST). The ES index names include the month and year.
An aggregation program (C#, not using NEST) pulls near real-time information from the various logs. It consists of a date histogram, some terms (host, ip, etc), and the summation of some integer fields. It makes a request similar to this:
"query": {
"aggs": {
"myBuckets": {
"composite": {
"sources": [
"aggregations": {
The problem lies in these integer fields, in that occasionally a rogue/buggy web service will use a string instead of an integer. This causes ES to change the index's mapping of the field (from integer to string), and breaks the aggregator.
Fixing the index through a re-index is not an option; we'd prefer to handle this on the fly if possible.
My current plan is to read the index's mapping and, where the field is mapped as a string, switch the summation aggregation to a Painless script similar to this:
doc['badField.keyword'].value!=null ? Integer.parseInt(doc['badField.keyword'].value) : 0
Is there a better way to handle this situation? If not, is there a more robust way of scripting the integer conversion?
... ES to change the index's mapping of the field
ES will never change the mapping of a field once it's created. The only way this can happen is if the first record you send to a new index has a string value instead of an integer value. Since your index names include the month and year, that can happen again every time a new index is created.
You can easily overcome this by creating an index template before you index your first record:
PUT _template/my-template
{
  "index_patterns": ["my-index*"],
  "mappings": {
    "_doc": {
      "properties": {
        "my_integer_field": {
          "type": "integer",            <---- this will always be honored
          "ignore_malformed": true      <---- ignore if the value really isn't an integer
        }
      }
    }
  }
}
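
A template only applies when an index is created, so indices that already have the field mapped as a string would still need the on-the-fly approach from the question. Here is a minimal sketch of that plan, in Python purely for illustration (the aggregator itself is C#). The helper `build_sum_agg` and the `NUMERIC_TYPES` set are hypothetical names, and the Painless source is an assumption about what a more defensive conversion could look like rather than a tested script: it checks `size()` before reading the value and returns 0 when the string does not parse.

```python
# Hypothetical helper: choose a plain or scripted sum aggregation
# based on how the field is mapped in a given index.

NUMERIC_TYPES = {
    "byte", "short", "integer", "long",
    "float", "double", "half_float", "scaled_float",
}

# Assumed Painless source: 0 for missing or unparseable values.
SAFE_PARSE_SOURCE = (
    "if (doc[params.f].size() == 0) { return 0; } "
    "try { return Integer.parseInt(doc[params.f].value); } "
    "catch (NumberFormatException e) { return 0; }"
)


def build_sum_agg(field, field_mapping):
    """Return a sum aggregation body for `field`.

    `field_mapping` is the entry for this field taken from the index's
    mapping, e.g. {"type": "integer"} or
    {"type": "text", "fields": {"keyword": {"type": "keyword"}}}.
    """
    mapped_type = field_mapping.get("type")

    # Correctly mapped index: aggregate on the field directly.
    if mapped_type in NUMERIC_TYPES:
        return {"sum": {"field": field}}

    # Mis-mapped index: find a doc-values-backed string field to parse.
    if mapped_type == "keyword":
        doc_field = field
    elif mapped_type == "text" and "keyword" in field_mapping.get("fields", {}):
        doc_field = field + ".keyword"
    else:
        raise ValueError(f"cannot aggregate {field!r} mapped as {mapped_type!r}")

    return {
        "sum": {
            "script": {
                "lang": "painless",
                "source": SAFE_PARSE_SOURCE,
                "params": {"f": doc_field},
            }
        }
    }
```

Passing the field name through `params` keeps the script source identical across fields, so ES can reuse the compiled script instead of compiling one per field. Note that a scripted sum will be slower than a plain field sum, so it is worth using only for the indices whose mapping is actually wrong.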