django amazon-web-services django-haystack

Django-haystack with AWS OpenSearch


I have a project running locally that I want to deploy on AWS.

Locally it runs in a Docker container alongside an Elasticsearch instance. In production I want to use AWS OpenSearch. When I run manage.py update_index against the Elasticsearch container, everything works. When I run it against OpenSearch on AWS, I get the following:

GET https://opensearch-instance.eu-west-2.es.amazonaws.com:443/haystack-local/_mapping [status:401 request:0.151s]
Undecodable raw error response from server: Expecting value: line 1 column 1 (char 0)
PUT https://opensearch-instance.eu-west-2.es.amazonaws.com:443/haystack-local [status:401 request:0.087s]
Undecodable raw error response from server: Expecting value: line 1 column 1 (char 0)
POST https://opensearch-instance.eu-west-2.es.amazonaws.com:443/haystack-local/modelresult/_bulk [status:400 request:0.061s]
POST https://opensearch-instance.eu-west-2.es.amazonaws.com:443/haystack-local/modelresult/_bulk [status:400 request:0.060s]
POST https://opensearch-instance.eu-west-2.es.amazonaws.com:443/haystack-local/modelresult/_bulk [status:400 request:0.058s]
POST https://opensearch-instance.eu-west-2.es.amazonaws.com:443/haystack-local/modelresult/_bulk [status:400 request:0.058s]
POST https://opensearch-instance.eu-west-2.es.amazonaws.com:443/haystack-local/modelresult/_bulk [status:400 request:0.058s]
--- Logging error ---
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/haystack/management/commands/update_index.py", line 119, in do_update
    backend.update(index, current_qs, commit=commit)
  File "/usr/local/lib/python3.11/site-packages/haystack/backends/elasticsearch_backend.py", line 239, in update
    bulk(
  File "/usr/local/lib/python3.11/site-packages/elasticsearch/helpers/__init__.py", line 257, in bulk
    for ok, item in streaming_bulk(client, actions, **kwargs):
  File "/usr/local/lib/python3.11/site-packages/elasticsearch/helpers/__init__.py", line 188, in streaming_bulk
    for data, (ok, info) in zip(
  File "/usr/local/lib/python3.11/site-packages/elasticsearch/helpers/__init__.py", line 99, in _process_bulk_chunk
    raise e
  File "/usr/local/lib/python3.11/site-packages/elasticsearch/helpers/__init__.py", line 95, in _process_bulk_chunk
    resp = client.bulk('\n'.join(bulk_actions) + '\n', **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/elasticsearch/client/utils.py", line 73, in _wrapped
    return func(*args, params=params, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/elasticsearch/client/__init__.py", line 1173, in bulk
    return self.transport.perform_request('POST', _make_path(index,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/elasticsearch/transport.py", line 312, in perform_request
    status, headers, data = connection.perform_request(method, url, params, body, ignore=ignore, timeout=timeout)
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/elasticsearch/connection/http_urllib3.py", line 129, in perform_request
    self._raise_error(response.status, raw_data)
  File "/usr/local/lib/python3.11/site-packages/elasticsearch/connection/base.py", line 125, in _raise_error
    raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info)
elasticsearch.exceptions.RequestError: <exception str() failed>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/logging/__init__.py", line 1110, in emit
    msg = self.format(record)
          ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/logging/__init__.py", line 953, in format
    return fmt.format(record)
           ^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/logging/__init__.py", line 687, in format
    record.message = record.getMessage()
                     ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/logging/__init__.py", line 377, in getMessage
    msg = msg % self.args
          ~~~~^~~~~~~~~~~
  File "/usr/local/lib/python3.11/site-packages/elasticsearch/exceptions.py", line 55, in __str__
    cause = ', %r' % self.info['error']['root_cause'][0]['reason']
                     ~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^
TypeError: string indices must be integers, not 'str'
Call stack:
  File "/app/src/manage.py", line 10, in <module>
    execute_from_command_line(sys.argv)
  File "/usr/local/lib/python3.11/site-packages/django/core/management/__init__.py", line 419, in execute_from_command_line
    utility.execute()
  File "/usr/local/lib/python3.11/site-packages/django/core/management/__init__.py", line 413, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/usr/local/lib/python3.11/site-packages/django/core/management/base.py", line 354, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/usr/local/lib/python3.11/site-packages/django/core/management/base.py", line 398, in execute
    output = self.handle(*args, **options)
  File "/usr/local/lib/python3.11/site-packages/haystack/management/commands/update_index.py", line 297, in handle
    self.update_backend(label, using)
  File "/usr/local/lib/python3.11/site-packages/haystack/management/commands/update_index.py", line 342, in update_backend
    max_pk = do_update(
  File "/usr/local/lib/python3.11/site-packages/haystack/management/commands/update_index.py", line 147, in do_update
    LOG.error(error_msg, error_context, exc_info=True)
Message: 'Failed indexing %(start)s - %(end)s (retry %(retries)s/%(max_retries)s): %(exc)s (pid %(pid)s): %(exc)s'
Arguments: {'start': 1, 'end': 3, 'retries': 5, 'max_retries': 5, 'pid': 1, 'exc': RequestError(400, 'no handler found for uri [/haystack-local/modelresult/_bulk] and method [POST]', {'error': 'no handler found for uri [/haystack-local/modelresult/_bulk] and method [POST]'})}
[ERROR/MainProcess] Error updating events using default 
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/haystack/management/commands/update_index.py", line 297, in handle
    self.update_backend(label, using)
  File "/usr/local/lib/python3.11/site-packages/haystack/management/commands/update_index.py", line 342, in update_backend
    max_pk = do_update(
             ^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/haystack/management/commands/update_index.py", line 119, in do_update
    backend.update(index, current_qs, commit=commit)
  File "/usr/local/lib/python3.11/site-packages/haystack/backends/elasticsearch_backend.py", line 239, in update
    bulk(
  File "/usr/local/lib/python3.11/site-packages/elasticsearch/helpers/__init__.py", line 257, in bulk
    for ok, item in streaming_bulk(client, actions, **kwargs):
  File "/usr/local/lib/python3.11/site-packages/elasticsearch/helpers/__init__.py", line 188, in streaming_bulk
    for data, (ok, info) in zip(
  File "/usr/local/lib/python3.11/site-packages/elasticsearch/helpers/__init__.py", line 99, in _process_bulk_chunk
    raise e
  File "/usr/local/lib/python3.11/site-packages/elasticsearch/helpers/__init__.py", line 95, in _process_bulk_chunk
    resp = client.bulk('\n'.join(bulk_actions) + '\n', **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/elasticsearch/client/utils.py", line 73, in _wrapped
    return func(*args, params=params, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/elasticsearch/client/__init__.py", line 1173, in bulk
    return self.transport.perform_request('POST', _make_path(index,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/elasticsearch/transport.py", line 312, in perform_request
    status, headers, data = connection.perform_request(method, url, params, body, ignore=ignore, timeout=timeout)
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/elasticsearch/connection/http_urllib3.py", line 129, in perform_request
    self._raise_error(response.status, raw_data)
  File "/usr/local/lib/python3.11/site-packages/elasticsearch/connection/base.py", line 125, in _raise_error
    raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info)
elasticsearch.exceptions.RequestError: <exception str() failed>
Traceback (most recent call last):
  File "/app/src/manage.py", line 10, in <module>
    execute_from_command_line(sys.argv)
  File "/usr/local/lib/python3.11/site-packages/django/core/management/__init__.py", line 419, in execute_from_command_line
    utility.execute()
  File "/usr/local/lib/python3.11/site-packages/django/core/management/__init__.py", line 413, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/usr/local/lib/python3.11/site-packages/django/core/management/base.py", line 354, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/usr/local/lib/python3.11/site-packages/django/core/management/base.py", line 398, in execute
    output = self.handle(*args, **options)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/haystack/management/commands/update_index.py", line 297, in handle
    self.update_backend(label, using)
  File "/usr/local/lib/python3.11/site-packages/haystack/management/commands/update_index.py", line 342, in update_backend
    max_pk = do_update(
             ^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/haystack/management/commands/update_index.py", line 119, in do_update
    backend.update(index, current_qs, commit=commit)
  File "/usr/local/lib/python3.11/site-packages/haystack/backends/elasticsearch_backend.py", line 239, in update
    bulk(
  File "/usr/local/lib/python3.11/site-packages/elasticsearch/helpers/__init__.py", line 257, in bulk
    for ok, item in streaming_bulk(client, actions, **kwargs):
  File "/usr/local/lib/python3.11/site-packages/elasticsearch/helpers/__init__.py", line 188, in streaming_bulk
    for data, (ok, info) in zip(
  File "/usr/local/lib/python3.11/site-packages/elasticsearch/helpers/__init__.py", line 99, in _process_bulk_chunk
    raise e
  File "/usr/local/lib/python3.11/site-packages/elasticsearch/helpers/__init__.py", line 95, in _process_bulk_chunk
    resp = client.bulk('\n'.join(bulk_actions) + '\n', **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/elasticsearch/client/utils.py", line 73, in _wrapped
    return func(*args, params=params, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/elasticsearch/client/__init__.py", line 1173, in bulk
    return self.transport.perform_request('POST', _make_path(index,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/elasticsearch/transport.py", line 312, in perform_request
    status, headers, data = connection.perform_request(method, url, params, body, ignore=ignore, timeout=timeout)
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/elasticsearch/connection/http_urllib3.py", line 129, in perform_request
    self._raise_error(response.status, raw_data)
  File "/usr/local/lib/python3.11/site-packages/elasticsearch/connection/base.py", line 125, in _raise_error
    raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info)
elasticsearch.exceptions.RequestError: <exception str() failed>

I realise I should be running OpenSearch in my Docker container instead (I'm taking over someone else's project, so I will get there). What I cannot work out is how to get django-haystack working with OpenSearch. I think I need to change the Haystack connection settings, which are currently:

HAYSTACK_CONNECTIONS = {
    "default": {
        "ENGINE": "haystack.backends.elasticsearch_backend.ElasticsearchSearchEngine",
        "URL": os.environ.get("ELASTICSEARCH_URL"),
        "INDEX_NAME": f"white-eagle-lodge-{DEPLOY_ENV}",  # noqa: F405
        "INCLUDE_SPELLING": True,
        "TIMEOUT": 180,
    }
}

I am concerned that Haystack doesn't provide an ENGINE for OpenSearch. Could anyone help with how to set it up? Thanks.


Solution

  • I've noticed a couple of similar questions without an answer. I have this working.

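    In requirements.txt, pin the client with elasticsearch<7.14. The settings below also import requests_aws4auth, so the requests-aws4auth package needs to be installed as well.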
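
The upper bound matters because, as far as I know, elasticsearch-py 7.14 and later check that the server is a genuine Elasticsearch cluster and raise UnsupportedProductError when talking to OpenSearch. A minimal sketch of a startup check that flags a client outside the pin; `satisfies_pin` is a hypothetical helper name, not part of Haystack or elasticsearch-py:

```python
import re
import warnings
from importlib import metadata


def satisfies_pin(version: str) -> bool:
    """Return True when an elasticsearch-py version string is below 7.14."""
    # Keep only the first two numeric components, e.g. "7.13.4" -> (7, 13).
    numbers = [int(n) for n in re.findall(r"\d+", version)[:2]]
    major, minor = (numbers + [0, 0])[:2]
    return (major, minor) < (7, 14)


try:
    installed = metadata.version("elasticsearch")
except metadata.PackageNotFoundError:
    installed = None  # client is not installed in this environment

if installed is not None and not satisfies_pin(installed):
    # Warn rather than fail, so the check never blocks startup on its own.
    warnings.warn(f"elasticsearch=={installed} does not satisfy elasticsearch<7.14")
```

The check only mirrors the pin from requirements.txt; it does not contact the server.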

    In settings.py:

    import os

    import elasticsearch
    from requests_aws4auth import AWS4Auth

    # Sign every request with AWS Signature Version 4 for the "es" service.
    awsauth = AWS4Auth(
        os.environ.get("AWS_ACCESS_KEY_ID", "<AWS_ACCESS_KEY>"),
        os.environ.get("AWS_SECRET_ACCESS_KEY", "<AWS_SECRET_KEY>"),
        "eu-west-2",  # region of the OpenSearch domain
        "es",
    )

    HAYSTACK_CONNECTIONS = {
        "default": {
            "ENGINE": "haystack.backends.elasticsearch7_backend.Elasticsearch7SearchEngine",
            "URL": os.environ.get("ELASTICSEARCH_URL"),
            "INDEX_NAME": f"haystack-{DEPLOY_ENV}",  # noqa: F405
            "INCLUDE_SPELLING": True,
            "TIMEOUT": 180,
            "KWARGS": {
                "port": 443,
                "http_auth": awsauth,
                "use_ssl": True,
                "verify_certs": True,
                # RequestsHttpConnection accepts a requests-style auth object.
                "connection_class": elasticsearch.RequestsHttpConnection,
            },
        }
    }
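
Since the question runs plain Elasticsearch in Docker locally and OpenSearch on AWS, one option is to add the AWS-specific KWARGS only in deployed environments. This is a sketch under assumptions: `build_haystack_connection` is a hypothetical helper, treating `DEPLOY_ENV == "local"` as the unsigned case is a guess based on the `haystack-local` index name in the log, the local URL is a placeholder, and the local container is assumed to run Elasticsearch 7.x so the same ENGINE applies. The auth object and connection class are passed in so the function itself needs no third-party imports.

```python
from typing import Any, Dict, Optional


def build_haystack_connection(
    url: Optional[str],
    deploy_env: str,
    http_auth: Any = None,
    connection_class: Any = None,
) -> Dict[str, Any]:
    """Build the "default" Haystack connection for one environment."""
    connection: Dict[str, Any] = {
        "ENGINE": "haystack.backends.elasticsearch7_backend.Elasticsearch7SearchEngine",
        "URL": url,
        "INDEX_NAME": f"haystack-{deploy_env}",
        "INCLUDE_SPELLING": True,
        "TIMEOUT": 180,
    }
    if deploy_env != "local":
        # Assumption: every non-local environment talks to AWS OpenSearch over HTTPS.
        connection["KWARGS"] = {
            "port": 443,
            "http_auth": http_auth,
            "use_ssl": True,
            "verify_certs": True,
            "connection_class": connection_class,
        }
    return connection


# Local Docker Elasticsearch: no request signing, so no KWARGS are added.
local_connection = build_haystack_connection("http://elasticsearch:9200", "local")
```

In settings.py the deployed case would pass the `awsauth` object and `elasticsearch.RequestsHttpConnection` from the configuration above as `http_auth` and `connection_class`.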
    

    This is working for me :)
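
On why the ENGINE change seems to matter: the 400 responses in the question are for `/haystack-local/modelresult/_bulk`, a path that includes a mapping type (`modelresult`). OpenSearch 2.x no longer routes typed paths, which would explain "no handler found", while the Elasticsearch 7 backend, as far as I can tell, sends typeless requests. The sketch below only illustrates the two URL shapes; `bulk_path` is a hypothetical function, not the client's real path helper.

```python
from typing import Optional


def bulk_path(index: str, doc_type: Optional[str] = None) -> str:
    """Join the non-empty parts into a bulk endpoint path."""
    parts = [index, doc_type, "_bulk"]
    return "/" + "/".join(part for part in parts if part)


# Typed path, as seen in the failing requests in the question's log.
print(bulk_path("haystack-local", "modelresult"))  # /haystack-local/modelresult/_bulk
# Typeless path, the only form without a mapping type.
print(bulk_path("haystack-local"))  # /haystack-local/_bulk
```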