Python-automated bulk request for Elasticsearch not working "must be terminated by a newline"


I am trying to automate a bulk request for Elasticsearch via Python.

To do that, I prepare the data for the request body as follows (stored in a list, one entry per row):

data = [{"index":{"_id": ID}}, {"tag": {"input": [tag], "weight":count}}]

Then I use requests to make the API call:

r = requests.put(endpoint, json = data, auth = auth)

This gives me the following error: b'{"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"The bulk request must be terminated by a newline [\\n]"}],"type":"illegal_argument_exception","reason":"The bulk request must be terminated by a newline [\\n]"},"status":400}'

I know that I need to put a newline at the end of the request, and that is where I am stuck: how can I append a newline to this data structure? I tried appending '\n' to the end of my list, but that didn't work.

Thank you guys!


Solution

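  • The payload must be sent as ndjson (one JSON object per line, terminated by a newline), and the index attribute needs to be specified as well. Passing the list via `json=` serializes it as a single JSON array instead, which is why the request is rejected. Here's a working snippet: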

    import requests
    import json
    
    endpoint = 'http://localhost:9200/_bulk'
    
    
    #                  vvvvvv
    data = [{"index": {"_index": "123", "_id": 123}},
            {"tag": {"input": ['tag'], "weight":10}}]
    
    
    #         vvv                                              vvv
    payload = '\n'.join([json.dumps(line) for line in data]) + '\n'
    
    r = requests.put(endpoint,
                     # `data` instead of `json`!
                     data=payload,
                     headers={           
                         # it's a requirement
                         'Content-Type': 'application/x-ndjson'
                     })
    
    print(r.json())
    

    P.S.: You may want to consider the bulk helper in the official Python client.
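
To see why `json=data` fails while the joined string works, here is a minimal sketch that compares the two serializations without touching the network. The helper name `to_ndjson` is made up for illustration, and `json.dumps(data)` is only an approximation of what `requests` sends for `json=data`:

```python
import json


def to_ndjson(lines):
    """Serialize each dict onto its own line; every line ends with a newline."""
    return "".join(json.dumps(line) + "\n" for line in lines)


data = [{"index": {"_index": "123", "_id": 123}},
        {"tag": {"input": ["tag"], "weight": 10}}]

# Roughly what `json=data` sends: one JSON array, no trailing newline.
as_json_array = json.dumps(data)

# What the bulk endpoint expects: one object per line, newline-terminated.
payload = to_ndjson(data)

print(as_json_array.endswith("\n"))  # False
print(payload.endswith("\n"))        # True
print(payload.count("\n"))           # 2
```

The resulting `payload` string is what gets passed as `data=payload` in the snippet above. Appending '\n' to the list itself does not help, because the list is still serialized as one JSON array.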