Tags: indexing, escaping, elasticsearch-7

Elasticsearch indexes JSON with escaped quotation marks - "Limit of total fields [1000] has been exceeded"


After upgrading from Elasticsearch 5.6.10 to 7.15.1, JSON strings are indexed with escaped quotation marks. This of course leads to nonsense data. I only realised it when I got the following exception:

mapping update rejected by primary java.lang.IllegalArgumentException: Limit of total fields [1000] has been exceeded

The indexing code looks like this:

for (...){
  def idx_record = buildEsRecord(r)     // getting a valid map without escape characters
  if (idx_record != null) {
    IndexRequest singleRequest = new IndexRequest(myIndex)
    singleRequest.id(idx_record['_id'].toString())
    idx_record.remove('_id')
    singleRequest.source(idx_record as JSON, XContentType.JSON)
    bulkRequest.add(singleRequest)
  }
}
BulkResponse bulkResponse = esClient.bulk(bulkRequest, RequestOptions.DEFAULT)
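
As an aside, a rejection like the one above does not necessarily surface as an exception thrown by esClient.bulk(...) itself; mapping problems usually come back as per-item failures on the BulkResponse. A minimal check, assuming the bulkResponse from the snippet above:

if (bulkResponse.hasFailures()) {
  // buildFailureMessage() concatenates the failure reason of every rejected item
  println bulkResponse.buildFailureMessage()
}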

Debugging idx_record as JSON shows a perfectly fine JSON string with no escaped quotation marks, for example:

{
    "uuid": "63fa7627-7d03-465b-93a3-a498feeb6689",
    "contentType": null,
    "description": null,
    "descriptionURL": null,
    ...
}

Is there something in the configuration of Elasticsearch 7 that I have missed? Can we set any parameters on the Elasticsearch client? Any other ideas?
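
For diagnosis it also helps to look at the mapping that was actually created, to see how many fields (and which names) ended up in the index. A rough sketch, assuming the esClient and myIndex from above; GetMappingsRequest and GetMappingsResponse come from org.elasticsearch.client.indices in the 7.x high-level REST client, and the exact accessor names may differ slightly between minor versions:

import org.elasticsearch.client.RequestOptions
import org.elasticsearch.client.indices.GetMappingsRequest
import org.elasticsearch.client.indices.GetMappingsResponse

// Fetch the current mapping and count its top-level fields -
// a runaway count points to bogus field names being created.
GetMappingsResponse mappingResponse = esClient.indices()
        .getMapping(new GetMappingsRequest().indices(myIndex), RequestOptions.DEFAULT)
def properties = mappingResponse.mappings()[myIndex].sourceAsMap()['properties'] as Map
println "Index ${myIndex} currently has ${properties?.size()} top-level fields"
properties?.keySet()?.take(20)?.each { fieldName -> println fieldName }   // sample of field names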


Solution

  • Found the problem. As one can guess, the programming language used here is Groovy (on Grails). idx_record as JSON needed to be converted to a String explicitly before being indexed (a sketch of the full corrected loop follows below). So the solution was simply changing:

    singleRequest.source(idx_record as JSON, XContentType.JSON)

    to

    singleRequest.source((idx_record as JSON).toString(), XContentType.JSON)
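
    For completeness, here is a sketch of the whole corrected loop with only that one change applied. It assumes the same buildEsRecord helper, myIndex, bulkRequest and esClient as in the question, a records collection standing in for the elided loop source, and grails.converters.JSON as the converter behind idx_record as JSON:

    import grails.converters.JSON
    import org.elasticsearch.action.bulk.BulkRequest
    import org.elasticsearch.action.bulk.BulkResponse
    import org.elasticsearch.action.index.IndexRequest
    import org.elasticsearch.client.RequestOptions
    import org.elasticsearch.common.xcontent.XContentType

    BulkRequest bulkRequest = new BulkRequest()
    for (r in records) {
      def idx_record = buildEsRecord(r)     // valid map without escape characters
      if (idx_record != null) {
        IndexRequest singleRequest = new IndexRequest(myIndex)
        singleRequest.id(idx_record['_id'].toString())
        idx_record.remove('_id')
        // Explicit toString() so the source(String, XContentType) overload is chosen
        singleRequest.source((idx_record as JSON).toString(), XContentType.JSON)
        bulkRequest.add(singleRequest)
      }
    }
    BulkResponse bulkResponse = esClient.bulk(bulkRequest, RequestOptions.DEFAULT)

    A plausible explanation for the original behaviour: without the explicit conversion, Groovy's dynamic dispatch does not see a String and therefore cannot pick source(String, XContentType), so the JSON object is routed through a different source(...) overload and what reaches Elasticsearch is not the plain JSON string - which would account for both the escaped quotation marks and the runaway number of mapped fields.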