Tags: docker, docker-compose, gremlin, gremlin-server, tinkergraph

Storing graph data with TinkerPop Gremlin run in Docker


I'm trying to use Gremlin Server with docker-compose to work with a graph database. The problem is that when I shut down or restart the container, no file is saved, and the graph is empty on the next start.

What am I doing wrong? :(

What I have done is:

  • Used this image for the container: tinkerpop/gremlin-server

  • Set the container volumes:

        volumes:
          - gremlin_data:/opt/gremlin-server/data

  • Set the docker-compose volumes:

        volumes:
          pgdata:
          django-static:
          gremlin_data:

  • Configured TinkerGraph to save the graph state on close:

        gremlin.tinkergraph.graphLocation=/opt/gremlin-server/data/graph.kryo
        gremlin.tinkergraph.graphFormat=gryo


Solution

  • The issue appears to be that gremlin-server is not shut down gracefully when running in Docker. TinkerGraph is primarily an in-memory graph, and it only saves to the location defined in gremlin.tinkergraph.graphLocation when the graph instance is closed. When gremlin-server receives a SIGINT, it closes the underlying TinkerGraph and the data is saved.

    However, in the gremlin-server Docker image as distributed, gremlin-server is not the foreground process in the container. When the container is shut down, the server is never signaled to shut down gracefully, so the save never takes place.

    I have tested a setup similar to the one you described: if I simply shut down the container, my graph is not saved. However, if I open a shell in the container and send a SIGINT to the server before shutdown, as follows, my data is preserved in my volume.

    /opt/gremlin-server $ ps
    PID   USER     TIME  COMMAND
        1 gremlin   0:00 {gremlin-server.} /bin/bash /opt/gremlin-server/bin/gremlin-server.sh conf/gremlin-server.yaml
       22 gremlin   0:06 java -Dlogback.configurationFile=file:/opt/gremlin-server/conf/logback.xml -Xms512m -Xmx4096m -cp :/opt/gremlin-server/conf/:/opt/gremlin-server/lib
       55 gremlin   0:00 /bin/sh
       61 gremlin   0:00 ps
    /opt/gremlin-server $ kill -INT 22 # PID of the java process (gremlin-server)
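
Looking up the PID by hand can be scripted. The following is only a sketch: `java_pid` is a hypothetical helper name, and it assumes `ps` prints the four-column layout shown above (PID, USER, TIME, COMMAND) with the server started through a command named `java`.

```shell
#!/bin/sh
# Hypothetical helper: read `ps` output on stdin and print the PID of the
# first process whose command is `java` (bare name or full path).
# Assumes the PID USER TIME COMMAND column layout shown above.
java_pid() {
    awk 'NR > 1 && $4 ~ /(^|\/)java$/ { print $1; exit }'
}

# Usage inside the container, before stopping it:
#   kill -INT "$(ps | java_pid)"
```

If no `java` process is found, the helper prints nothing, so check for an empty result before passing it to `kill`.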
    

    I believe this is something which should be resolved within TinkerPop. I have created a TinkerPop JIRA for such an improvement.

    EDIT:

    Stephen Mallette from the TinkerPop community pointed me to an older, related JIRA that has a much more elegant workaround. If you send graph.close() as a Gremlin script before shutting down the container, the graph will close gracefully and save to your volume. If you are connected via gremlin-console, you can simply run the command:

    gremlin> graph.close()
    ==>null
    

    If you are connecting through the Java driver or one of the other GLVs, you will have to submit the command as a script:

    client.submit("graph.close()");