python, python-3.x, elasticsearch, elasticsearch-5

Elasticsearch bulk update using Python: how to append new data to an array field


How do I update a field in Elasticsearch with a bulk update from Python? I have tried many approaches and all of them fail with an error. In some cases I get a "document missing" error, so how do I update and upsert at the same time? Appending to the field is not working either. The Python package I am using is elasticsearch==7.9.1.

for i in range(0, length, steps):
    end_index = length-1 if i+steps>length else i+steps
    temp_list = test_data[i: end_index]
    bulk_file = ''
    actions = [{
        "_index": "test-data",
        "_opt_type":"update",
        "_type": "test-test-data",
        "_id": test_row ['testId'],
        "doc":{"script": {
                          "source": "ctx._source.DataIds.add(params.DataIds)",
                          "lang": "painless",
                          "params": {
                              "DataIds":test_row ['DataIds']
                          }
                      }}
        }
        for test_row in temp_list
    ]
    helpers.bulk(es, actions)

The error I am getting is this:

    {'update': {'_index': 'test-data', '_type': 'products', '_id': '333', 'status': 400, 'error': {'type': 'illegal_argument_exception', 'reason': 'failed to execute script', 'caused_by': {'type': 'script_exception', 'reason': 'runtime error', 'script_stack': ['ctx._source.dataIds.add(params.dataIds)', '    ^---- HERE'], 'script': 'if (ctx._source.dataIds == null) { ctx._source.dataIds = []; } ctx._source.dataIds.add(params.dataIds)', 'lang': 'painless', 'position': {'offset': 105, 'start': 71, 'end': 118}, 'caused_by': {'type': 'illegal_argument_exception', 'reason': 'dynamic method [java.lang.String, add/1] not found'}}}, 'data': {'upsert': {}, 'scripted_upsert': True, 'script': {'source': 'if (ctx._source.dataIds == null) { ctx._source.dataIds = []; } ctx._source.dataIds.add(params.dataIds)', 'lang': 'painless', 'params': {'cdataIds': 'set123'}}}}}])

Solution

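  • To upsert via a script, put the script section at the top level of the action instead of nesting it inside doc. You also need an upsert section if you want to insert and update in the same command. Note as well that the key read by the bulk helper is _op_type, not _opt_type. It goes like this: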

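    actions = [{
        "_op_type": "update",
        "_index": "test-data",
        "_type": "test-test-data",
        "_id": test_row['testId'],
        # Document created when the id does not exist yet. The value is
        # wrapped in a list (assuming DataIds is a single value) so that a
        # later .add() call finds an array rather than a string.
        "upsert": {
            "DataIds": [test_row['DataIds']]
        },
        # Runs only when the document already exists.
        "script": {
            "source": "ctx._source.DataIds.add(params.DataIds)",
            "lang": "painless",
            "params": {
                "DataIds": test_row['DataIds']
            }
        }
    } for test_row in temp_list
    ]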
    

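    Another way to do it is with scripted_upsert, which runs the script even when the document does not exist yet, so the script itself has to create the field: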

    actions = [{
        "_op_type": "update",
        "_index": "test-data",
        "_type": "test-test-data",
        "_id": test_row['testId'],
        # Empty starting document; the script fills it in.
        "upsert": {},
        # Python boolean: True, not the JSON literal true.
        "scripted_upsert": True,
        "script": {
            "source": "if (ctx._source.DataIds == null) { ctx._source.DataIds = []; } ctx._source.DataIds.add(params.DataIds)",
            "lang": "painless",
            "params": {
                "DataIds": test_row['DataIds']
            }
        }
    } for test_row in temp_list
    ]
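
The inner cause in the error above, 'dynamic method [java.lang.String, add/1] not found', says that the script called add on a String, which suggests the field already holds a single string in some documents rather than an array. A minimal sketch of a scripted upsert that tolerates this case could look like the following. It has not been run against a cluster; build_append_actions and APPEND_SCRIPT are hypothetical names, and the optional doc_type argument is an assumption for older clusters that still require _type.

```python
from typing import Any, Dict, Iterable, Iterator, Optional

# Painless script (untested sketch): create the list if the field is missing,
# wrap a plain value left over from earlier writes, then append the new value.
APPEND_SCRIPT = (
    "if (ctx._source.DataIds == null) { ctx._source.DataIds = []; } "
    "else if (!(ctx._source.DataIds instanceof List)) "
    "{ ctx._source.DataIds = [ctx._source.DataIds]; } "
    "ctx._source.DataIds.add(params.DataIds)"
)


def build_append_actions(
    rows: Iterable[Dict[str, Any]],
    index: str = "test-data",
    doc_type: Optional[str] = None,
) -> Iterator[Dict[str, Any]]:
    """Yield one scripted-upsert update action per input row."""
    for row in rows:
        action: Dict[str, Any] = {
            "_op_type": "update",
            "_index": index,
            "_id": row["testId"],
            "upsert": {},             # empty starting document
            "scripted_upsert": True,  # run the script on insert as well
            "script": {
                "source": APPEND_SCRIPT,
                "lang": "painless",
                "params": {"DataIds": row["DataIds"]},
            },
        }
        if doc_type is not None:
            # Assumption: only needed on clusters that still use mapping types.
            action["_type"] = doc_type
        yield action


# Usage (needs a reachable cluster, so it is left commented out):
# from elasticsearch import Elasticsearch, helpers
# es = Elasticsearch()
# helpers.bulk(es, build_append_actions(test_data), chunk_size=500)
```

helpers.bulk already splits the actions into batches (chunk_size, 500 by default), so the manual slicing loop from the question should not be needed when the actions come from a generator like this.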