python elasticsearch elasticsearch-painless

Partially update a document using scripting and add missing fields


I would like to know whether it's possible to update a document with a partial document and use a script to perform another action in the same request. For example, if I index data1 and then apply data2, I want my document to look like final_result. Every field should be replaced or added, except the tag field, which should accumulate its values in a list.

data1 = {"name" : "myname", "code" : 123, "tag" : "first"}

data2 = {"name" : "myname", "code" : 555, "tag" : "second", "age":"50", "children": "3"}

final_result = {"name" : "myname", "code" : 555, "tag" : ["first","second"], "age":"50", "children": "3"}

I can update the tag field using the script below, but I don't know how to add the missing fields at the same time. I also don't know in advance which fields might be added.

POST myindex/_update/1
{

      "script" : {
        "source": "if(! ctx._source.tag.contains(params.tag)){if (ctx._source.tag instanceof List) { ctx._source.tag.add(params.tag) } else { ctx._source.tag = [ctx._source.tag, params.tag] }}",
        "lang": "painless",
        "params" : {
            "tag" : "sec"
        }
    }

}

I would really appreciate it if anyone could give me an example of how to do this in Python.


Solution

  • You just need to set the new values for the fields in the same script, passing them in as params.

    POST myindex/_update/1
    {
      "script": {
        "source": """
          if (!ctx._source.tag.contains(params.tag)) {
            if (ctx._source.tag instanceof List) {
              ctx._source.tag.add(params.tag);
            } else {
              ctx._source.tag = [ctx._source.tag, params.tag];
            }
          }
          ctx._source.code = params.code;
        """,
        "lang": "painless",
        "params": {
          "tag": "sec",
          "code": 555
        }
      }
    }
    

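    It works the same way in Python: create the Elasticsearch client and call the update API, which is the client equivalent of POST myindex/_update/1. (update_by_query would instead run the script against every document matching a query, not just document 1.)

    from elasticsearch import Elasticsearch

    es = Elasticsearch(['https://user:secret@localhost:443'])
    

    Or...

    es = Elasticsearch(
        ['localhost', 'otherhost'],
        http_auth=('user', 'secret'),
        scheme="https",
        port=443,
    )
    

    And then pass the same script as the body, where q is a dict holding the "script" object shown above:

    es.update(index="myindex", id=1, body=q)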
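
Since the question says the incoming fields are not known in advance, one possible approach is to pass the whole partial document as a single param and loop over it in Painless, treating only tag specially. The following is a minimal sketch, not a tested solution: build_update_body and merge_like_script are hypothetical helper names, the Painless source has not been run against a cluster here, and the "upsert" entry assumes you want the first write (data1) to create the document as-is. The pure-Python function only mirrors the intended merge rules so they can be checked locally.

```python
# Hypothetical Painless script: copy every incoming field except "tag",
# then append the incoming tag to the existing one without duplicates.
MERGE_SCRIPT = """
for (def entry : params.doc.entrySet()) {
  if (entry.getKey() != 'tag') {
    ctx._source.put(entry.getKey(), entry.getValue());
  }
}
if (params.doc.containsKey('tag')) {
  def tag = params.doc.get('tag');
  def current = ctx._source.get('tag');
  if (current == null) {
    ctx._source.put('tag', tag);
  } else if (current instanceof List) {
    if (!current.contains(tag)) { current.add(tag); }
  } else if (current != tag) {
    ctx._source.put('tag', [current, tag]);
  }
}
"""


def build_update_body(doc):
    """Build the request body for the update API from a partial document."""
    return {
        "script": {
            "source": MERGE_SCRIPT,
            "lang": "painless",
            "params": {"doc": doc},
        },
        # Assumption: if the document does not exist yet, index `doc` unchanged.
        "upsert": doc,
    }


def merge_like_script(source, doc):
    """Pure-Python mirror of the merge rules above, for local checking only."""
    merged = dict(source)
    for key, value in doc.items():
        if key != "tag":
            merged[key] = value
    if "tag" in doc:
        tag = doc["tag"]
        current = merged.get("tag")
        if current is None:
            merged["tag"] = tag
        elif isinstance(current, list):
            if tag not in current:
                merged["tag"] = current + [tag]
        elif current != tag:
            merged["tag"] = [current, tag]
    return merged


data1 = {"name": "myname", "code": 123, "tag": "first"}
data2 = {"name": "myname", "code": 555, "tag": "second", "age": "50", "children": "3"}

body = build_update_body(data2)
preview = merge_like_script(data1, data2)

# With a 7.x-style client as in the answer above, the call would look like:
# es.update(index="myindex", id=1, body=body)
```

Note that the script compares the existing string tag with equality rather than contains, because contains on a string is a substring check and would treat "sec" as already present in "second".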