elasticsearch, insert-update, rollover

Updating existing documents in Elasticsearch (ES) while using the rollover API


I have a data source that will create a large number of entries, which I plan to store in Elasticsearch. The source creates two entries for the same document in Elasticsearch:

  • the 'init' part, which records the init time and other details under a random key in ES
  • the 'finish' part, which contains the main data and updates (merges into) the initially created document in ES under the init's random key.

I will need to use time-based indexes in Elasticsearch, with an alias pointing to the current index, managed through the rollover index API. For updates I'll use the update API to merge init and finish.
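The init/finish flow described here could be sketched as two requests sent through the alias. This is only a sketch: the alias name `events-write`, the field names, and the helper functions are made up for illustration, and the paths assume the typeless endpoints of Elasticsearch 7.x or later. The helpers only build the requests; sending them is left to whatever HTTP client is in use.

```python
import json
import uuid


def init_request(alias, doc_id, init_fields):
    """Build the request that stores the 'init' part under a random key."""
    # PUT /<alias>/_create/<id> indexes the document only if the id is unused.
    return ("PUT", f"/{alias}/_create/{doc_id}", init_fields)


def finish_request(alias, doc_id, finish_fields):
    """Build the partial update that merges the 'finish' part into the same id."""
    # POST /<alias>/_update/<id> with a "doc" body merges fields into the stored document.
    return ("POST", f"/{alias}/_update/{doc_id}", {"doc": finish_fields})


if __name__ == "__main__":
    key = uuid.uuid4().hex  # random key shared by both parts (hypothetical scheme)
    for method, path, body in (
        init_request("events-write", key, {"init_time": "2024-01-15T10:00:00Z"}),
        finish_request("events-write", key, {"status": "done"}),
    ):
        print(method, path, json.dumps(body))
```

Both requests address the alias, not a concrete index, which is exactly what makes the behaviour after a rollover worth checking.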

Question: If the init document with the random key is not in the current index (but in an older one that has already been rolled over), would an update using its key execute successfully? If not, what is the best practice for performing the update?


Solution

  • Since nobody answered, I set out to test it myself.

    Short answer: after the index is rolled over under an alias, an update operation through the alias refers to the new index only. It does not find the 'init' document in the older index, so (when the update is sent as an upsert) it creates the document in the new index, resulting in two separate documents.

    One way to solve this is to search the last two indexes (or more if needed) for the document and use the concrete, non-alias index name that holds it for the update.
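A minimal sketch of that lookup, assuming the indexes have names like `events-000001` and that the search response has the usual `hits.hits[]._index` shape. The helper names and index names are hypothetical, and sending the requests is left to whatever client is in use.

```python
def lookup_request(index_names, doc_id):
    """Build a search over several concrete indexes for one document id."""
    path = "/" + ",".join(index_names) + "/_search"
    # The ids query matches on _id; the document body is not needed here.
    body = {"query": {"ids": {"values": [doc_id]}}, "_source": False}
    return ("POST", path, body)


def index_holding(search_response):
    """Return the concrete index of the first hit, or None if nothing matched."""
    hits = search_response.get("hits", {}).get("hits", [])
    return hits[0]["_index"] if hits else None


def update_path(search_response, doc_id, fallback_alias):
    """Target the index that holds the document, else fall back to the alias."""
    index = index_holding(search_response) or fallback_alias
    return f"/{index}/_update/{doc_id}"
```

The cost of this approach is one extra search per 'finish' entry, which may matter at high volume.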

    The other solution, which I prefer, is to avoid rollover altogether: calculate the index name from the required date field of our document and create new indexes from the application, using an index template to define the mapping. This way, event sourcing and replaying the documents in order will yield the same indexes.
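One way to derive the index name from the document itself is sketched below. The prefix `events`, the field name `init_time`, the daily granularity, and the function name are assumptions for illustration, not something the setup above prescribes.

```python
from datetime import datetime, timezone


def index_name_for(doc, prefix="events", date_field="init_time"):
    """Derive a daily index name from an ISO 8601 timestamp stored in the document."""
    raw = doc[date_field]
    # datetime.fromisoformat does not accept a trailing "Z" before Python 3.11.
    ts = datetime.fromisoformat(raw.replace("Z", "+00:00"))
    if ts.tzinfo is None:
        ts = ts.replace(tzinfo=timezone.utc)  # assumption: naive timestamps are UTC
    # Normalise to UTC so the same instant always maps to the same index.
    return f"{prefix}-{ts.astimezone(timezone.utc):%Y.%m.%d}"
```

This only helps if the 'finish' part carries the same date value as the 'init' part, so that both resolve to the same index name without a lookup.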