I'm trying to get data from AWS Neptune using the Apache TinkerPop Gremlin client with a RemoteConnection. It seems that next() of GraphTraversal fetches data from the LinkedBlockingQueue of ResultQueue, while another thread, which handles the network call, enqueues the data into that LinkedBlockingQueue. So even if I don't consume all the data from the traversal, the client keeps fetching the complete result of the query. This answer suggests that the Gremlin console's next() can help in getting only the required data rather than loading the complete data into memory. But that's not the case here. I tried the batchSize parameter of RequestOptions, but that does not help either.
Is there any way, using the Java client, to load only the required data into memory in batches (using next(size)) rather than loading all of it at once? The basic example to connect to Neptune DB using a Java client is here.
Behind the scenes, the Apache TinkerPop Gremlin Java client keeps receiving data from the server as fast as possible into a local buffer, even if you call something like next(1). This is done because the entire result needs to be sent to the client, if possible, before the server-side timeout triggers. If you are using Amazon Neptune, a new server-side Result Cache feature was just introduced that allows true server-side caching and pagination. See this link for details.
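To make the buffering behaviour described above concrete, here is a minimal simulation using only the Java standard library. It is not the TinkerPop driver code: the class and method names (EagerBufferSketch, SimulatedResultSet, bufferedAfterSingleNext) are hypothetical, and the "network thread" is just a loop that enqueues integers. It only illustrates why calling next() once does not stop the rest of the result from arriving in local memory.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class EagerBufferSketch {

    /** Hypothetical stand-in for a result set backed by a local queue. */
    static final class SimulatedResultSet {
        private final BlockingQueue<Integer> buffer = new LinkedBlockingQueue<>();
        private final Thread networkThread;

        SimulatedResultSet(int totalResults) {
            // The "network" thread enqueues every result as fast as it can,
            // without checking how many the consumer has actually asked for.
            networkThread = new Thread(() -> {
                for (int i = 0; i < totalResults; i++) {
                    buffer.add(i);
                }
            });
            networkThread.start();
        }

        /** Takes one result from the local buffer, blocking until one is available. */
        int next() {
            try {
                return buffer.take();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for a result", e);
            }
        }

        /** Waits until the simulated server has sent everything. */
        void awaitAllReceived() {
            try {
                networkThread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for the producer", e);
            }
        }

        int buffered() {
            return buffer.size();
        }
    }

    /** Consumes a single result, then reports how many are still held in memory. */
    static int bufferedAfterSingleNext(int totalResults) {
        SimulatedResultSet results = new SimulatedResultSet(totalResults);
        results.next();             // the caller only wants one result (totalResults must be >= 1)
        results.awaitAllReceived(); // but the producer delivers all of them anyway
        return results.buffered();
    }

    public static void main(String[] args) {
        // prints "still buffered: 9999"
        System.out.println("still buffered: " + bufferedAfterSingleNext(10_000));
    }
}
```

In this sketch the consumer reads one item, yet the other 9,999 still end up in the queue. That matches the observation in the question: limiting what you read on the client side does not limit what is received, so the amount of data has to be bounded on the server side (for example by paginating the query itself).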