I have a data source that will produce a large number of entries, which I plan to store in ElasticSearch. The source creates two entries, init and finish, for the same document in ElasticSearch.
I will need time-based indexes in ElasticSearch, with an alias pointing to the current index, managed through the rollover index API. For updates I'll use the update API to merge the init and finish entries.
Question: If the init document with the random key is not in the current index (but in an older one that has already been rolled over), would an update by its key succeed? If not, what is the best practice for performing the update?
Since there were no answers for a while, I set out to test it myself.
Short answer: no. After the index is rolled over under an alias, an update sent through the alias is routed only to the new index. It does not find the init document in the older index, so an upsert creates the document in the new index and you end up with two separate documents.
One way to solve this is to search the last two (or more, if needed) indexes to find which concrete (non-alias) index holds the document, and then use that index name for the update.
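The search-then-update approach could be sketched roughly as below. This is only a sketch under assumptions: `client` is expected to behave like the official Python client with 8.x-style keyword arguments (`search(index=..., query=..., size=...)` and `update(index=..., id=..., doc=...)`), and the function names, the alias `logs-write`, and the index names are hypothetical.

```python
from typing import Any, Optional, Sequence


def find_concrete_index(client: Any, indexes: Sequence[str], doc_id: str) -> Optional[str]:
    """Return the concrete index that holds doc_id, or None if it is in none of them."""
    resp = client.search(
        index=",".join(indexes),              # e.g. the last two rolled-over indexes
        query={"ids": {"values": [doc_id]}},  # look the document up by its key
        size=1,
    )
    hits = resp["hits"]["hits"]
    return hits[0]["_index"] if hits else None


def update_finish(client: Any, alias: str, indexes: Sequence[str],
                  doc_id: str, fields: dict) -> str:
    """Merge the finish fields into the init document, wherever it lives."""
    target = find_concrete_index(client, indexes, doc_id)
    if target is None:
        # Assumption: if no init document exists yet, create one via the alias.
        client.update(index=alias, id=doc_id, doc=fields, doc_as_upsert=True)
        return alias
    # Address the concrete index directly so the existing document is updated.
    client.update(index=target, id=doc_id, doc=fields)
    return target
```

It would be called as, for example, `update_finish(es, "logs-write", ["logs-000002", "logs-000001"], key, {"status": "finished"})`. The cost is one extra search per update, and the search and the update are not atomic.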
Another solution, which I prefer, is to avoid rollover altogether: calculate the index name from a required date field of the document, and let the application create new indexes, with an index template defining the mapping. This way, event sourcing and replaying the documents in order will yield the same indexes.
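A minimal sketch of deriving the index name from the document's date field follows. The `events` prefix, the daily granularity, and the choice to normalize to UTC are assumptions for illustration; the point is only that the name depends on the document and not on when it is written.

```python
from datetime import datetime, timezone


def index_name_for(timestamp: str, prefix: str = "events") -> str:
    """Build a daily index name from an ISO 8601 timestamp in the document."""
    ts = datetime.fromisoformat(timestamp)
    if ts.tzinfo is not None:
        # Normalize to UTC so the same event always maps to the same index.
        ts = ts.astimezone(timezone.utc)
    # Naive timestamps are taken as they are (assumed to already be in UTC).
    return f"{prefix}-{ts:%Y.%m.%d}"


# Both the init and the finish entry carry the same date field,
# so they resolve to the same index no matter when they arrive.
print(index_name_for("2019-03-05T10:15:00+00:00"))  # events-2019.03.05
```

Because the name is a pure function of the document, replaying old events writes them to the same indexes as the first time, and an index template matching `events-*` can supply the mapping when a new index is created.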