aws-lambda amazon-neptune

AWS Neptune connection pool settings


We're using a connection pool to communicate with AWS Neptune from an AWS Lambda function, and we are experiencing various connection problems. They usually appear after a maintenance window and require a Neptune restart to fix.

For example, below is an error raised in a Python Lambda after an automatic SSL certificate rollout in AWS Neptune:

Max retries exceeded with url: /endpoint/ (Caused by SSLError(SSLCertVerificationError(1, 
'[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1131)')))

This behavior seems to be related to the Neptune endpoint functionality and is mentioned in the AWS documentation:

A custom endpoint for a Neptune cluster represents a set of DB instances that you choose. When you connect to the endpoint, Neptune chooses one of the instances in the group to handle the connection.

When you add a DB instance to a custom endpoint or remove it from a custom endpoint, any existing connections to that DB instance remain active.

As long as a connection is still considered valid, it is not removed from the pool, even though it no longer works.

My question: How can the HTTP connection pool be configured on the client side to address this behavior? Is it possible to check a Neptune connection before using it?


Solution

  • The general best practice is to assume a connection is still alive/valid and to catch the exception and reconnect when you encounter these sorts of errors. Example Lambda function architectures (mainly for Gremlin, but other query languages would follow similar patterns) are shown here: https://docs.aws.amazon.com/neptune/latest/userguide/lambda-functions-examples.html
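
The catch-and-reconnect idea can be sketched in Python roughly as follows. This is a minimal illustration, not the code from the linked AWS examples: `ReconnectingClient`, `factory`, `retry_on`, `max_attempts`, and `backoff_s` are hypothetical names introduced here, and the default `retry_on` is Python's built-in `ConnectionError`, which is an assumption you would need to replace with the exception types your HTTP or Gremlin client actually raises.

```python
import time


class ReconnectingClient:
    """Holds one pooled client and rebuilds it when a call fails.

    Hypothetical helper: `factory` is any zero-argument callable that
    returns a fresh client (for example, a new HTTP session).
    """

    def __init__(self, factory, retry_on=(ConnectionError,),
                 max_attempts=3, backoff_s=0.0):
        self._factory = factory
        self._retry_on = tuple(retry_on)
        self._max_attempts = max_attempts
        self._backoff_s = backoff_s
        self._client = factory()

    def call(self, fn):
        """Run fn(client); on a retryable error, replace the client and retry."""
        for attempt in range(1, self._max_attempts + 1):
            try:
                return fn(self._client)
            except self._retry_on:
                if attempt == self._max_attempts:
                    raise  # out of attempts: surface the original error
                self._reset()
                time.sleep(self._backoff_s * attempt)  # linear backoff

    def _reset(self):
        """Discard the current client (and its pooled connections)."""
        close = getattr(self._client, "close", None)
        if callable(close):
            try:
                close()
            except Exception:
                pass  # the old client is being thrown away anyway
        self._client = self._factory()
```

In a Lambda, the wrapper would typically be created once at module level so that warm invocations reuse it. With the `requests` library, for instance, one might pass `requests.Session` as `factory` and `requests.exceptions.ConnectionError` in `retry_on` (its `SSLError` is a subclass of that exception), then run `client.call(lambda s: s.get(url))`. Whether a fresh session actually resolves the certificate error depends on the cause, so treat this as a starting point.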