gremlin amazon-neptune gremlinpython neptune

How to Set a Query-specific Timeout in gremlin_python for AWS Neptune?

I am using the gremlin_python library to execute a query on an AWS Neptune database and am experiencing timeouts despite setting a specific timeout threshold via the evaluationTimeout parameter. Here's the specific query and the error message I received:

count_companies_with_representatives = (
    self.reader_g.V()
    .with_("evaluationTimeout", MAX_QUERY_TIMEOUT)
    .hasLabel("company")
    .where(__.bothE("director").count().is_(P.gt(0)))
    .count()
    .next()
)

Error Message:

{"detailedMessage":"A timeout occurred within the script [RequestMessage{, requestId=xxxxx, op='bytecode', processor='traversal', args={gremlin=[[], [V(), with(evaluationTimeout, 1000000), hasLabel(company), where([[], [bothE(director), count(), is(gt(0))]]), count()]], aliases={g=g}}}]","code":"TimeLimitExceededException","requestId":"xxxxx","message":"A timeout occurred within the script [RequestMessage{, requestId=xxxxx, op='bytecode', processor='traversal', args={gremlin=[[], [V(), with(evaluationTimeout, 1000000), hasLabel(company), where([[], [bothE(director), count(), is(gt(0))]]), count()]], aliases={g=g}}}]"}

Despite setting the evaluationTimeout to MAX_QUERY_TIMEOUT (1000000 ms in this instance), the query is prematurely timing out. I need to ensure that each query respects this set timeout to manage performance effectively and avoid these premature terminations.

I set the timeout directly in the query using the evaluationTimeout parameter, expecting the query to be allowed to run for up to MAX_QUERY_TIMEOUT milliseconds. However, the error shows that the timeout occurs well before that limit is reached. How do I correctly apply a timeout to an individual query in this environment, or is there a better approach to managing query execution times in Neptune?

Solution

There are three places where query timeouts can be set in an Amazon Neptune cluster: cluster-wide, instance-level, and per-query. The first two take precedence, as a database administrator would want to have ultimate control over the run time of any given query. By default, the timeout at the cluster or instance level is 2 minutes (120,000 ms). This cannot be exceeded via the evaluationTimeout value within a query; the query-level timeout can only be used to set a limit lower than the overall cluster- or instance-level timeout.
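To make the precedence concrete, here is a minimal sketch of the rule described above. It only models the behavior (the server-side limit caps whatever the query requests) and is not a Neptune or gremlin_python API. The function name and the constant are hypothetical, and the constant assumes the 2-minute default has not been changed.

```python
# Hypothetical model of the rule above; not part of any Neptune API.
# Assumes the server-side limit is still the default of 2 minutes.
DEFAULT_NEPTUNE_QUERY_TIMEOUT_MS = 120_000


def effective_timeout_ms(requested_ms, server_limit_ms=DEFAULT_NEPTUNE_QUERY_TIMEOUT_MS):
    """Return the timeout that actually applies to a query.

    The per-query evaluationTimeout can lower the limit but cannot
    raise it above the cluster/instance-level setting.
    """
    return min(requested_ms, server_limit_ms)


# The 1,000,000 ms requested in the question is capped at the server limit.
print(effective_timeout_ms(1_000_000))  # 120000
# A shorter per-query timeout is honored as requested.
print(effective_timeout_ms(30_000))     # 30000
```

Under this model, the query in the question times out after about 2 minutes no matter how large MAX_QUERY_TIMEOUT is.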

You'll need to raise the neptune_query_timeout parameter, at either the cluster or the instance level, to allow queries to run longer than the default 2-minute timeout: https://docs.aws.amazon.com/neptune/latest/userguide/parameters.html#parameters-db-cluster-parameters-neptune_query_timeout

Setting this parameter will require a restart. If set at the cluster level, all instances in the cluster will need to be restarted. If set at the instance level, that instance will need to be restarted.

A custom timeout setting at the cluster level will override the default timeout for any instances within the cluster.

A custom instance timeout setting will override both default and custom settings that are set at the cluster level.
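As an illustration of the cluster-level change, the sketch below uses boto3's Neptune client and its modify_db_cluster_parameter_group call. It assumes the cluster already uses a custom cluster parameter group (default groups cannot be modified). The group name "my-neptune-cluster-params" and both helper function names are hypothetical. The "pending-reboot" apply method is chosen to match the restart requirement described above. Treat this as a starting point rather than a definitive procedure.

```python
def build_timeout_parameter(timeout_ms, apply_method="pending-reboot"):
    """Build the Parameters entry that sets neptune_query_timeout (in ms)."""
    if timeout_ms <= 0:
        raise ValueError("timeout_ms must be positive")
    return {
        "ParameterName": "neptune_query_timeout",
        "ParameterValue": str(timeout_ms),  # parameter values are passed as strings
        "ApplyMethod": apply_method,
    }


def raise_cluster_query_timeout(parameter_group_name, timeout_ms, region_name=None):
    """Update neptune_query_timeout in a custom cluster parameter group."""
    # Imported lazily so build_timeout_parameter can be used without boto3.
    import boto3

    client = boto3.client("neptune", region_name=region_name)
    return client.modify_db_cluster_parameter_group(
        DBClusterParameterGroupName=parameter_group_name,
        Parameters=[build_timeout_parameter(timeout_ms)],
    )


# Example call (hypothetical group name; requires AWS credentials):
# raise_cluster_query_timeout("my-neptune-cluster-params", 1_000_000)
```

Once the new value is in effect, an evaluationTimeout of up to that value can be requested per query.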