gremlin tinkerpop tinkerpop3 gremlin-server

idleConnectionTimeout in Gremlin driver

I have an application running on my local machine (where the Gremlin driver runs), a Gremlin Server running on a remote host, and a load balancer in between.

I have set the keepAliveInterval of the Gremlin driver to keep the connection between my local machine and the load balancer open, but the connection still gets dropped. (The idle timeout on the load balancer is larger than the keepAliveInterval I configured in the Gremlin driver.)

I checked the log and found that after the connection is dropped, the Gremlin driver keeps sending keep-alive messages to the load balancer, but it gets no response and does not detect that either.

Is there any way to find out that the keep-alive response has not been received? Or is there a setting, like idleConnectionTimeout in Gremlin Server, that lets the Gremlin driver realize its keep-alive requests are getting no response?

Solution

The driver's keep-alive process is not terribly robust, as it is a bit of a redundancy for the server-side keep-alive. The driver swallows exceptions related to sending its own keep-alive messages and only logs a WARN message, so you would have to find that message in the log to determine whether there was a failure.
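If scanning the log for WARN messages is not practical, one possible workaround is an application-level probe: send a cheap request on a schedule and treat a missing reply as a dead connection. The sketch below is only an illustration of that idea and is not part of the Gremlin driver. ConnectionProbe and its parameters are hypothetical names, and it assumes your application can wrap a request in a CompletableFuture (for example, something along the lines of the driver's submitAsync call with a trivial script).

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

/** Hypothetical application-level probe; NOT part of the Gremlin driver. */
final class ConnectionProbe {
    // Assumption: the supplier issues one cheap request and returns its future.
    private final Supplier<CompletableFuture<?>> request;
    private final long timeoutMillis;

    ConnectionProbe(Supplier<CompletableFuture<?>> request, long timeoutMillis) {
        this.request = request;
        this.timeoutMillis = timeoutMillis;
    }

    /** Returns true only if the request completes normally within the timeout. */
    boolean isAlive() {
        CompletableFuture<?> pending;
        try {
            pending = request.get();
        } catch (RuntimeException e) {
            // The request could not even be issued.
            return false;
        }
        try {
            pending.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            // No reply in time: the connection may have been dropped silently.
            pending.cancel(true);
            return false;
        } catch (ExecutionException | CancellationException e) {
            // The request failed or was cancelled.
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

When isAlive() returns false, the application could close and recreate its client rather than waiting for the driver to notice. The timeout should be chosen well below the load balancer's idle timeout so a silent drop is caught quickly.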

I'm not sure what gets your environment into a state where the load balancer loses the connection. The server and driver work together to maintain the connection. When the driver is idle, it sends a keep-alive once the keepAliveInterval is exceeded, which should reset the idle state for both driver and server. The server monitors the read and write idle state (with idleConnectionTimeout and keepAliveInterval respectively) and will send a keep-alive to the client. Whichever end sends the "ping" (server or client), the other end should return a "pong". I wonder if there is a state of the load balancer in which it kills the connection if the "ping" doesn't come from the client side?