I need to update several thousand items every few minutes in Elasticsearch, and unfortunately reindexing is not an option for me. From my research, the best way to update an item is to use _update_by_query. I have had success updating single documents like so:
{
"query": {
"match": {
"itemId": {
"query": "1"
}
}
},
"script": {
"source": "ctx._source.field = params.updateValue",
"lang": "painless",
"params": {
"updateValue": "test"
}
}
}
var response = await Client.UpdateByQueryAsync<dynamic>(q => q
.Index("masterproducts")
.Query(qd => x.MatchQuery)
.Script(s => s.Source(x.Script).Lang("painless").Params(x.Params))
.Conflicts(Elasticsearch.Net.Conflicts.Proceed)
);
Although this works, it is extremely inefficient because it generates thousands of requests. Is there a way to update multiple documents with matching IDs in a single request? I have already tried the Multi Search API, which it seems cannot be used for this purpose. Any help would be appreciated!
If possible, try to generalize your query. Instead of targeting a single itemId, perhaps try using a terms query:
{
"query": {
"terms": {
"itemId": [
"1", "2", ...
]
}
},
"script": {
...
}
}
From the looks of it, your (seemingly simplified) script sets the same value regardless of the document ID / itemId, so the terms query above covers that case.
If the script does indeed set different values based on the doc IDs / itemIds, you could make the params multi-value:
"params": {
"updateValue1": "test1",
"updateValue2": "test2",
...
}
and then dynamically access them:
...
def value_to_set = params['updateValue' + ctx._source['itemId']];
...
so the target doc is updated with the corresponding value.
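Putting the two pieces together, a complete request body might look roughly like the sketch below. The IDs and values are placeholders, and the containsKey guard with ctx.op = 'noop' is my own addition so that a matched document without a corresponding param is skipped rather than overwritten with null. It also assumes itemId is mapped as a keyword or numeric field, since a terms query is not analyzed.

```json
{
  "query": {
    "terms": {
      "itemId": ["1", "2"]
    }
  },
  "script": {
    "lang": "painless",
    "source": "def key = 'updateValue' + ctx._source['itemId']; if (params.containsKey(key)) { ctx._source.field = params[key]; } else { ctx.op = 'noop'; }",
    "params": {
      "updateValue1": "test1",
      "updateValue2": "test2"
    }
  }
}
```

Since the script source stays the same and only the params change between runs, you would only need to regenerate the terms list and the params map for each batch.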