After upgrading from Elasticsearch 5.6.10 to 7.15.1, JSON strings are indexed with escaped quotation marks, which of course produces nonsense data. I first noticed it when I got the following exception:
mapping update rejected by primary java.lang.IllegalArgumentException: Limit of total fields [1000] has been exceeded
The indexing code looks like this:
for (...) {
    def idx_record = buildEsRecord(r) // getting a valid map without escape characters
    if (idx_record != null) {
        IndexRequest singleRequest = new IndexRequest(myIndex)['_id'].toString())
        singleRequest.source(idx_record as JSON, XContentType.JSON)
    }
}
BulkResponse bulkResponse = esClient.bulk(bulkRequest, RequestOptions.DEFAULT)
Debugging idx_record as JSON shows a perfectly fine JSON string with no escaped quotation marks, for example:
"uuid": "63fa7627-7d03-465b-93a3-a498feeb6689",
"contentType": null,
"description": null,
"descriptionURL": null,
Is there something in the configuration of Elasticsearch 7 that I have missed? Can we set any parameters on the Elasticsearch client? Any other ideas?
Found the problem. As one can guess, the programming language used here is Groovy (on Grails). The result of idx_record as JSON needed to be explicitly converted to a String before being indexed. So the solution was simply changing:
singleRequest.source(idx_record as JSON, XContentType.JSON)
to:
singleRequest.source((idx_record as JSON).toString(), XContentType.JSON)
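For anyone wondering why the explicit toString() matters, here is a self-contained sketch of what I assume is going on. It is a plausible explanation, not something verified against the cluster in the post above. IndexRequest has a source(String, XContentType) overload and also a source(Object...) overload that treats its arguments as field name/value pairs. Groovy picks the overload by the runtime type of the arguments, so an object produced by "as JSON" (which is not a String) would plausibly end up in the varargs version, and the whole JSON text becomes a single field name. That would also fit the field limit exception, since every document would add a new field. JsonLike and FakeIndexRequest below are hypothetical stand-ins, not real Grails or Elasticsearch classes.

```groovy
import groovy.json.JsonOutput
import groovy.json.JsonSlurper

// Hypothetical stand-in for grails.converters.JSON:
// not a String, but toString() yields the JSON text.
class JsonLike {
    Map data
    String toString() { JsonOutput.toJson(data) }
}

// Hypothetical stand-in mirroring two IndexRequest.source overloads.
class FakeIndexRequest {
    String picked   // which overload was chosen
    Map doc         // the document that would be indexed

    // Analogous to source(String, XContentType): the string is the whole document.
    FakeIndexRequest source(String json, String contentType) {
        picked = 'string'
        doc = new JsonSlurper().parseText(json) as Map
        this
    }

    // Analogous to source(Object...): arguments are field name / value pairs.
    FakeIndexRequest source(Object... pairs) {
        picked = 'varargs'
        doc = [:]
        for (int i = 0; i < pairs.length; i += 2) {
            doc[pairs[i].toString()] = pairs[i + 1]
        }
        this
    }
}

def record = new JsonLike(data: [uuid: '63fa7627-7d03-465b-93a3-a498feeb6689'])

// Non-String argument: Groovy dispatches to the varargs overload,
// so the JSON text ends up as a field name.
def wrong = new FakeIndexRequest().source(record, 'JSON')
println "${wrong.picked}: ${wrong.doc}"

// Explicit toString(): the String overload is chosen and the JSON is parsed as a document.
def right = new FakeIndexRequest().source(record.toString(), 'JSON')
println "${right.picked}: ${right.doc}"
```

If the record is already a plain Map, passing it straight to IndexRequest.source(Map) should also avoid the detour through a JSON string, though I have not tried that in this setup.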