Tags: gremlin, tinkerpop, amazon-neptune

TinkerPop Neptune Config - Running in Docker


Does anyone have a gremlin-config.yaml file that would more accurately reflect how Gremlin acts in Neptune?

I am trying to run as much as I can in a local Docker container, and I've set properties such as gremlin.tinkergraph.vertexIdManager=ANY so that vertex IDs can be strings. But I'm still missing details like multiple labels, which I think are only available via the Neo4j config, and I'm unsure what else that would change.

Generally, I'm looking for a config that mirrors how Neptune functions as closely as possible.

Current:

host: 172.17.0.2
port: 8182
evaluationTimeout: 30000
channelizer: org.apache.tinkerpop.gremlin.server.channel.WebSocketChannelizer
graphs: {
  graph: conf_local/tinkergraph-custom.properties}
scriptEngines: {
  gremlin-groovy: {
    plugins: { org.apache.tinkerpop.gremlin.server.jsr223.GremlinServerGremlinPlugin: {},
               org.apache.tinkerpop.gremlin.tinkergraph.jsr223.TinkerGraphGremlinPlugin: {},
               org.apache.tinkerpop.gremlin.jsr223.ImportGremlinPlugin: {classImports: [java.lang.Math], methodImports: [java.lang.Math#*]},
               org.apache.tinkerpop.gremlin.jsr223.ScriptFileGremlinPlugin: {files: [scripts/empty-sample.groovy]}}}}
serializers:
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV3d0, config: { ioRegistries: [org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerIoRegistryV3d0] }}        # application/json
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphBinaryMessageSerializerV1 }                                                                                                           # application/vnd.graphbinary-v1.0
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphBinaryMessageSerializerV1, config: { serializeResultToString: true }}                                                                 # application/vnd.graphbinary-v1.0-stringd
processors:
  - { className: org.apache.tinkerpop.gremlin.server.op.session.SessionOpProcessor, config: { sessionTimeout: 28800000 }}
  - { className: org.apache.tinkerpop.gremlin.server.op.traversal.TraversalOpProcessor, config: { cacheExpirationTime: 600000, cacheMaxSize: 1000 }}
metrics: {
  consoleReporter: {enabled: true, interval: 180000},
  csvReporter: {enabled: true, interval: 180000, fileName: /tmp/gremlin-server-metrics.csv},
  jmxReporter: {enabled: true},
  slf4jReporter: {enabled: true, interval: 180000}}
strictTransactionManagement: false
idleConnectionTimeout: 0
keepAliveInterval: 0
maxInitialLineLength: 4096
maxHeaderSize: 8192
maxChunkSize: 8192
maxContentLength: 10485760
maxAccumulationBufferComponents: 1024
resultIterationBatchSize: 64
writeBufferLowWaterMark: 32768
writeBufferHighWaterMark: 65536
ssl: {
  enabled: false}

conf_local/tinkergraph-custom.properties:

gremlin.tinkergraph.vertexIdManager=ANY
gremlin.graph=org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerGraph

Docker Compose file:

version: '3'
services:
  gremlin-server:
    container_name: gwent_onboarding_neptune
    image: tinkerpop/gremlin-server:3.5.0
    user: $USER
    volumes:
      - ./configuration/gremlin-conf:/opt/gremlin-server/conf_local
      # - ./gremlin-console/data:/gremlin-server/scripts
    ports:
      - 8182:8182
    command: ./conf_local/gremlin-serverr.yaml

Edit: I started down this rabbit hole when I was trying to get TinkerPop to work with multiple labels.


Solution

  • It looks as if you have already done the things I usually recommend, such as using strings for the ID values. TinkerGraph does not currently support transactions, so to simulate those you would likely need something like JanusGraph running in "inmemory" mode behind a Gremlin Server. For the remaining differences, at present it is mostly a case of avoiding features that Neptune does not support, such as meta properties, in the code and queries that you write. Apart from multiple labels, as you mentioned, you should be able to do a lot of query development and testing locally. At some point, of course, you will still need to test against an actual Neptune cluster to verify that your workloads behave as expected on the full Neptune cluster architecture.
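
    As a rough sketch of the JanusGraph "inmemory" option mentioned above, a graph properties file along these lines could stand in for tinkergraph-custom.properties. The file name janusgraph-inmemory.properties is only illustrative, and this assumes the server image has JanusGraph on its classpath, which the plain tinkerpop/gremlin-server image does not. The graphs entry in the server YAML would then point at this file instead.

    ```properties
    # Hypothetical conf_local/janusgraph-inmemory.properties (file name is an assumption)
    # Open the graph through JanusGraph's factory instead of TinkerGraph
    gremlin.graph=org.janusgraph.core.JanusGraphFactory
    # Keep all data in memory; nothing is persisted across container restarts
    storage.backend=inmemory
    ```

    This only adds transaction semantics for local testing. It does not make the local server behave like Neptune in other respects, such as multiple labels.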