amazon-web-services elasticsearch aws-glue opensearch

Connecting to AWS OpenSearch with Glue


I have an AWS Glue job (Glue 2.0, Python 3) that used to load data into an Elasticsearch cluster hosted on EC2 instances. The connection was made with a dependent JAR (elasticsearch-spark-20_2.11-7.8.1.jar). We have now moved to a managed OpenSearch 1.2 cluster (HTTPS required, fine-grained access control not enabled), and I'm trying to figure out how to connect to this new cluster from Glue. The OpenSearch cluster is in a private VPC that the Glue job role has access to. I have also given the Glue role full access to the OpenSearch service for testing purposes. I have tried:

  1. Updating the ES JAR to elasticsearch-spark-20_2.11-7.13.4.jar and connecting using

# df is the Spark DataFrame being written (variable name assumed)
df.write.format(
    'org.elasticsearch.spark.sql'
).mode(
    'overwrite'
).option(
    'es.nodes', 'full_https_endpoint'
).option(
    'es.port', 443
).option(
    'es.resource', 'index_name'
).option(
    'es.nodes.wan.only', True
).save()

but I get the error "An error occurred while calling o328.save. Cannot detect ES version - typically this happens if the network/Elasticsearch cluster is not accessible or when targeting a WAN/Cloud instance without the proper setting 'es.nodes.wan.only'"

  2. Using the Elasticsearch connector for AWS Glue from AWS Marketplace, and connecting with

ElasticsearchConnector7134forAWSGlue10and20_node1658268217103 = glueContext.write_dynamic_frame.from_options(
    frame=dynamicFrame_fin,
    connection_type="marketplace.spark",
    connection_options={
        "path": "index_name",
        "es.nodes.wan.only": "true",
        "es.nodes": "full_https_endpoint",
        "es.port": "443",
        "connectionName": "opensearch_dev",
    },
    transformation_ctx="ElasticsearchConnector7134forAWSGlue10and20_node1658268217103",
)

but I get a similar error: "An error occurred while calling o323.pyWriteDynamicFrame. Cannot detect ES version - typically this happens if the network/Elasticsearch cluster is not accessible or when targeting a WAN/Cloud instance without the proper setting 'es.nodes.wan.only'"

  3. Connecting to the OpenSearch cluster from Glue as well as from Lambda using the method described here (https://docs.aws.amazon.com/opensearch-service/latest/developerguide/request-signing.html#request-signing-python), but I get timeouts when attempting to connect to the cluster

Questions:

  1. Do any of my errors suggest additional security considerations?
  2. Am I on the right track with any of the methods above?
  3. Do I need to sign the HTTP request even though fine-grained access control is disabled?

Solution

  • This also tripped me up for days. When you created the OpenSearch cluster, did you check "enable compatibility mode"?

    Setup example

    Without this mode enabled, if you hit your domain's endpoint to retrieve the version, you'll get back 1.2.0, which the driver you've wired up isn't expecting, so it fails with the same error you've posted.

    When you enable compatibility mode, it will report back the version number as something your driver can understand.

    Example with compatibility turned on:

    "version": {
      "number": "7.10.2",
    }
    
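If you want to confirm what your domain is reporting, one option is to fetch the root endpoint and read version.number yourself. This is only a minimal sketch: it assumes the request is made from somewhere that can reach the domain (inside the VPC) and that the domain's access policy accepts an unsigned request. The function names and the endpoint argument are hypothetical, introduced just for illustration.

```python
import json
import urllib.request


def parse_reported_version(body: str) -> str:
    """Return version.number from the JSON served at the domain's root endpoint."""
    return json.loads(body)["version"]["number"]


def looks_like_elasticsearch_7(version: str) -> bool:
    """True if the reported major version is 7 (e.g. 7.10.2 in compatibility mode)."""
    return version.split(".")[0] == "7"


def fetch_reported_version(endpoint: str, timeout: float = 10.0) -> str:
    """GET the root of the endpoint (assumed reachable, unsigned) and return its version."""
    with urllib.request.urlopen(endpoint, timeout=timeout) as resp:
        return parse_reported_version(resp.read().decode("utf-8"))
```

If fetch_reported_version returns 1.2.0, compatibility mode is most likely off. If the call itself times out, the problem is more likely network access than the version check.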

    The rest of your setup looks good, so hopefully this is what's blocking you.
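If the domain already exists, compatibility mode can also be switched on afterwards through the _cluster/settings API, using the compatibility.override_main_response_version setting. Below is a rough sketch of building that request. The helper names and the placeholder endpoint are hypothetical, and it assumes the domain accepts an unsigned request from wherever you run it; if your access policy requires IAM-signed requests, you would need to sign it instead.

```python
import json
import urllib.request


def compatibility_settings_body(enabled: bool = True) -> bytes:
    """JSON body that turns the version override (compatibility mode) on or off."""
    payload = {"persistent": {"compatibility.override_main_response_version": enabled}}
    return json.dumps(payload).encode("utf-8")


def build_settings_request(endpoint: str, enabled: bool = True) -> urllib.request.Request:
    """Build (but do not send) a PUT request to <endpoint>/_cluster/settings."""
    return urllib.request.Request(
        url=endpoint.rstrip("/") + "/_cluster/settings",
        data=compatibility_settings_body(enabled),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Usage (not run here; needs network access to the domain):
# req = build_settings_request("https://your-domain-endpoint")  # placeholder endpoint
# with urllib.request.urlopen(req, timeout=10) as resp:
#     print(resp.read().decode("utf-8"))
```

After the setting is applied, the root endpoint should report a 7.x version number like the example above.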