I have two distinct bulk uploads to perform, and the order in which they will run is completely unpredictable.
In one load I would have the fields SERVER_NAME, OS, and PROD_1_VERSION.
In the other one, I would have the fields SERVER_NAME, OS, and PROD_2_VERSION.
My files look like this:
{"index":{"_index" : "myindex", "_id" : "MY_SERVER_1" }}
{"SERVER_NAME":"MY_SERVER_1","OS":"Ubuntu","PROD_1_VERSION":"1.0.0.5" }
{"index":{"_index" : "myindex", "_id" : "MY_SERVER_2" }}
{"SERVER_NAME":"MY_SERVER_2","OS":"Windows10","PROD_1_VERSION":"2.0.0.0" }
{"index":{"_index" : "myindex", "_id" : "MY_SERVER_3" }}
{"SERVER_NAME":"MY_SERVER_3","OS":"Fedora","PROD_1_VERSION":"2.5.0.1" }
and:
{"index":{"_index" : "myindex", "_id" : "MY_SERVER_1" }}
{"SERVER_NAME":"MY_SERVER_1","OS":"Ubuntu","PROD_2_VERSION":"6.0.0.5" }
{"index":{"_index" : "myindex", "_id" : "MY_SERVER_2" }}
{"SERVER_NAME":"MY_SERVER_2","OS":"Windows10","PROD_2_VERSION":"7.0.0.0" }
{"index":{"_index" : "myindex", "_id" : "MY_SERVER_3" }}
{"SERVER_NAME":"MY_SERVER_3","OS":"Fedora","PROD_2_VERSION":"8.5.0.1" }
If both loads use "index", the property "PROD_2_VERSION" will be added, but "PROD_1_VERSION" will be lost.
If I use "update" rather than "index" (including { "doc" : ... } before the properties), the first load fails, as it tries to update something that does not exist yet.
If the first load has "index" and the second has "update", it works; however, as mentioned, the order in which the loads run can't be controlled.
Is there a way to make it work like this:
if the record exists,
behave like 'update'
else
behave like 'index'
?
I'm not sure I fully understand your use case, but to do an "upsert" (insert or update) in an Elasticsearch bulk request you must add
"doc_as_upsert" : true
after your doc part.
Here is the example from the official Elasticsearch documentation:
{ "update" : {"_id" : "2", "_index" : "index1", "retry_on_conflict" : 3} }
{ "doc" : {"field" : "value"}, "doc_as_upsert" : true }
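Applied to the files in the question, every "index" action would become an "update" action with "doc_as_upsert" set to true, so whichever load runs first creates the document and the other one merges its fields into it. Below is a minimal sketch of how such a bulk body could be generated in Python. The function name to_upsert_bulk and its parameters are hypothetical, made up for illustration; the sketch only builds the text to send to the _bulk endpoint and does not talk to a cluster.

```python
import json


def to_upsert_bulk(records, index="myindex", id_field="SERVER_NAME"):
    """Build a newline-delimited bulk body of partial-update upserts.

    Hypothetical helper: assumes each record is a dict and that the
    value of `id_field` is used as the document _id.
    """
    lines = []
    for record in records:
        # Action line: "update" instead of "index".
        action = {"update": {"_index": index, "_id": record[id_field]}}
        # Source line: partial document, inserted as-is if the _id is new.
        body = {"doc": record, "doc_as_upsert": True}
        lines.append(json.dumps(action))
        lines.append(json.dumps(body))
    # The bulk body has to end with a newline.
    return "\n".join(lines) + "\n"


# One record from the first load in the question.
load_1 = [
    {"SERVER_NAME": "MY_SERVER_1", "OS": "Ubuntu", "PROD_1_VERSION": "1.0.0.5"},
]

bulk_body = to_upsert_bulk(load_1)
print(bulk_body, end="")
```

Running the same function over the second file gives the matching body with PROD_2_VERSION, and the two bodies can then be sent in either order.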