Why can't I connect to Gremlin-Server?


Abstract

I'm trying to set up a Titan/Cassandra/Gremlin-Server stack in Docker (v1.13.0). The problem I'm facing is that applications trying to connect to Gremlin-Server on the default port 8182 are reporting errors (details below).

First, here is some relevant version information:

  • Cassandra v2.2.8
  • Titan v1.0.0 (Hadoop 1)
  • Gremlin 3.2.3

Setup

Setup takes place in a Dockerfile in order to be reproducible. It assumes that a Cassandra container already exists, running a cassandra.yaml in which start_rpc has been set to true.

The Dockerfile is as follows:

FROM openjdk:alpine

ENV TITAN 'titan-1.0.0-hadoop1'

RUN apk update && apk add bash unzip && rm -rf /var/cache/apk/* \
    && adduser -S -s /bin/bash -D srg \
    && wget -O /tmp/$TITAN.zip http://s3.thinkaurelius.com/downloads/titan/$TITAN.zip \
    && unzip /tmp/$TITAN.zip -d /opt && ln -s /opt/$TITAN /opt/titan \
    && rm /tmp/*.zip \
    && chown -R srg /opt/$TITAN/ \
    && /opt/titan/bin/gremlin-server.sh -i org.apache.tinkerpop gremlin-python 3.2.3

COPY conf/gremlin-server/* /opt/$TITAN/conf/gremlin-server/

USER srg
WORKDIR /opt/titan
EXPOSE 8182

CMD ["bin/gremlin-server.sh", "conf/gremlin-server/srg.yaml"]

The astute reader will note that I am copying custom configuration files into the container, namely a Gremlin-Server configuration file (srg.yaml) and a Titan graph properties file (srg.properties).

srg.yaml

host: localhost
port: 8182
threadPoolWorker: 1
gremlinPool: 8
scriptEvaluationTimeout: 30000
serializedResponseTimeout: 30000
channelizer: org.apache.tinkerpop.gremlin.server.channel.WebSocketChannelizer
graphs: {
  graph: conf/gremlin-server/srg.properties
  }
plugins:
  - aurelius.titan
scriptEngines: {
  gremlin-groovy: {
    imports: [java.lang.Math],
    staticImports: [java.lang.Math.PI],
    scripts: [scripts/empty-sample.groovy]},
  gremlin-jython: {},
  gremlin-python: {},
  nashorn: {
      imports: [java.lang.Math],
      staticImports: [java.lang.Math.PI]}}
serializers:
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { useMapperFromGraph: graph }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { serializeResultToString: true }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV1d0, config: { useMapperFromGraph: graph }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0, config: { useMapperFromGraph: graph }}
processors:
  - { className: org.apache.tinkerpop.gremlin.server.op.session.SessionOpProcessor, config: { sessionTimeout: 28800000 }}
metrics: {
  consoleReporter: {enabled: true, interval: 180000},
  csvReporter: {enabled: true, interval: 180000, fileName: /tmp/gremlin-server-metrics.csv},
  jmxReporter: {enabled: true},
  slf4jReporter: {enabled: true, interval: 180000},
  gangliaReporter: {enabled: false, interval: 180000, addressingMode: MULTICAST},
  graphiteReporter: {enabled: false, interval: 180000}}
threadPoolBoss: 1
maxInitialLineLength: 4096
maxHeaderSize: 8192
maxChunkSize: 8192
maxContentLength: 65536
maxAccumulationBufferComponents: 1024
resultIterationBatchSize: 64
writeBufferLowWaterMark: 32768
writeBufferHighWaterMark: 65536
ssl: {
  enabled: false}

srg.properties

gremlin.graph=com.thinkaurelius.titan.core.TitanFactory
storage.backend=cassandrathrift
# refers to the linked container
storage.hostname=cassandra
cache.db-cache = true
cache.db-cache-clean-wait = 20
cache.db-cache-time = 180000
cache.db-cache-size = 0.25

# Start elasticsearch inside the Titan JVM
index.search.backend=elasticsearch
index.search.directory=db/es
index.search.elasticsearch.client-only=false
index.search.elasticsearch.local-mode=true

Execution

The container is run with the following command: docker run -ti --rm=true --link test.cassandra:cassandra -p 8182:8182 titan.

Here is the log output for Gremlin-Server:

0    [main] INFO  org.apache.tinkerpop.gremlin.server.GremlinServer  - 
         \,,,/
         (o o)
-----oOOo-(3)-oOOo-----

297  [main] INFO  org.apache.tinkerpop.gremlin.server.GremlinServer  - Configuring Gremlin Server from conf/gremlin-server/srg.yaml
439  [main] INFO  org.apache.tinkerpop.gremlin.server.util.MetricManager  - Configured Metrics ConsoleReporter configured with report interval=180000ms
448  [main] INFO  org.apache.tinkerpop.gremlin.server.util.MetricManager  - Configured Metrics CsvReporter configured with report interval=180000ms to fileName=/tmp/gremlin-server-metrics.csv
557  [main] INFO  org.apache.tinkerpop.gremlin.server.util.MetricManager  - Configured Metrics JmxReporter configured with domain= and agentId=
561  [main] INFO  org.apache.tinkerpop.gremlin.server.util.MetricManager  - Configured Metrics Slf4jReporter configured with interval=180000ms and loggerName=org.apache.tinkerpop.gremlin.server.Settings$Slf4jReporterMetrics
1750 [main] INFO  com.thinkaurelius.titan.core.util.ReflectiveConfigOptionLoader  - Loaded and initialized config classes: 12 OK out of 12 attempts in PT0.148S
1972 [main] INFO  com.thinkaurelius.titan.diskstorage.cassandra.thrift.CassandraThriftStoreManager  - Closed Thrift connection pooler.
1990 [main] INFO  com.thinkaurelius.titan.graphdb.configuration.GraphDatabaseConfiguration  - Generated unique-instance-id=ac1100031-ad2d5ffa52e81
2026 [main] INFO  com.thinkaurelius.titan.diskstorage.Backend  - Configuring index [search]
2386 [main] INFO  org.elasticsearch.node  - [Lunatik] version[1.5.1], pid[1], build[5e38401/2015-04-09T13:41:35Z]
2387 [main] INFO  org.elasticsearch.node  - [Lunatik] initializing ...
2399 [main] INFO  org.elasticsearch.plugins  - [Lunatik] loaded [], sites []
6471 [main] INFO  org.elasticsearch.node  - [Lunatik] initialized
6472 [main] INFO  org.elasticsearch.node  - [Lunatik] starting ...
6477 [main] INFO  org.elasticsearch.transport  - [Lunatik] bound_address {local[1]}, publish_address {local[1]}
6507 [main] INFO  org.elasticsearch.discovery  - [Lunatik] elasticsearch/u2StmRW1RsyEHw561yoNFw
6519 [elasticsearch[Lunatik][clusterService#updateTask][T#1]] INFO  org.elasticsearch.cluster.service  - [Lunatik] master {new [Lunatik][u2StmRW1RsyEHw561yoNFw][ad2d5ffa52e8][local[1]]{local=true}}, removed {[Lunatik][kKyL9UE-R123LLZTTrsVCw][ad2d5ffa52e8][local[1]]{local=true},}, reason: local-disco-initial_connect(master)
6908 [main] INFO  org.elasticsearch.http  - [Lunatik] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/172.17.0.3:9200]}
6909 [main] INFO  org.elasticsearch.node  - [Lunatik] started
6923 [elasticsearch[Lunatik][clusterService#updateTask][T#1]] INFO  org.elasticsearch.gateway  - [Lunatik] recovered [0] indices into cluster_state
7486 [elasticsearch[Lunatik][clusterService#updateTask][T#1]] INFO  org.elasticsearch.cluster.metadata  - [Lunatik] [titan] creating index, cause [api], templates [], shards [5]/[1], mappings []
8075 [main] INFO  com.thinkaurelius.titan.diskstorage.Backend  - Initiated backend operations thread pool of size 4
8241 [main] INFO  com.thinkaurelius.titan.diskstorage.Backend  - Configuring total store cache size: 94787290
8641 [main] INFO  com.thinkaurelius.titan.diskstorage.log.kcvs.KCVSLog  - Loaded unidentified ReadMarker start time 2017-01-21T16:31:28.750Z into com.thinkaurelius.titan.diskstorage.log.kcvs.KCVSLog$MessagePuller@3520958b
8642 [main] INFO  org.apache.tinkerpop.gremlin.server.GremlinServer  - Graph [graph] was successfully configured via [conf/gremlin-server/srg.properties].
8643 [main] INFO  org.apache.tinkerpop.gremlin.server.util.ServerGremlinExecutor  - Initialized Gremlin thread pool.  Threads in pool named with pattern gremlin-*
14187 [main] INFO  com.jcabi.manifests.Manifests  - 108 attributes loaded from 264 stream(s) in 185ms, 108 saved, 3371 ignored: ["Agent-Class", "Ant-Version", "Archiver-Version", "Bnd-LastModified", "Boot-Class-Path", "Build-Date", "Build-Host", "Build-Id", "Build-Java-Version", "Build-Jdk", "Build-Job", "Build-Number", "Build-Time", "Build-Timestamp", "Build-Version", "Built-At", "Built-By", "Built-OS", "Built-On", "Built-Status", "Bundle-ActivationPolicy", "Bundle-Activator", "Bundle-BuddyPolicy", "Bundle-Category", "Bundle-ClassPath", "Bundle-Classpath", "Bundle-Copyright", "Bundle-Description", "Bundle-DocURL", "Bundle-License", "Bundle-Localization", "Bundle-ManifestVersion", "Bundle-Name", "Bundle-NativeCode", "Bundle-RequiredExecutionEnvironment", "Bundle-SymbolicName", "Bundle-Vendor", "Bundle-Version", "Can-Redefine-Classes", "Change", "Class-Path", "Created-By", "DynamicImport-Package", "Eclipse-AutoStart", "Eclipse-BuddyPolicy", "Eclipse-SourceReferences", "Embed-Dependency", "Embedded-Artifacts", "Export-Package", "Extension-Name", "Extension-name", "Fragment-Host", "Git-Commit-Branch", "Git-Commit-Date", "Git-Commit-Hash", "Git-Committer-Email", "Git-Committer-Name", "Gradle-Version", "Gremlin-Lib-Paths", "Gremlin-Plugin-Dependencies", "Gremlin-Plugin-Paths", "Ignore-Package", "Implementation-Build", "Implementation-Build-Date", "Implementation-Title", "Implementation-URL", "Implementation-Vendor", "Implementation-Vendor-Id", "Implementation-Version", "Import-Package", "Include-Resource", "JCabi-Build", "JCabi-Date", "JCabi-Version", "Java-Vendor", "Java-Version", "Main-Class", "Main-class", "Manifest-Version", "Maven-Version", "Module-Email", "Module-Origin", "Module-Owner", "Module-Source", "Originally-Created-By", "Os-Arch", "Os-Name", "Os-Version", "Package", "Premain-Class", "Private-Package", "Require-Bundle", "Require-Capability", "Scm-Connection", "Scm-Revision", "Scm-Url", "Specification-Title", "Specification-Vendor", "Specification-Version", 
"Tool", "X-Compile-Source-JDK", "X-Compile-Target-JDK", "hash", "implementation-version", "mode", "package", "url", "version"]
14842 [main] INFO  org.apache.tinkerpop.gremlin.groovy.engine.ScriptEngines  - Loaded gremlin-jython ScriptEngine
15540 [main] INFO  org.apache.tinkerpop.gremlin.groovy.engine.ScriptEngines  - Loaded nashorn ScriptEngine
16076 [main] INFO  org.apache.tinkerpop.gremlin.groovy.engine.ScriptEngines  - Loaded gremlin-python ScriptEngine
16553 [main] INFO  org.apache.tinkerpop.gremlin.groovy.engine.ScriptEngines  - Loaded gremlin-groovy ScriptEngine
17410 [main] INFO  org.apache.tinkerpop.gremlin.groovy.engine.GremlinExecutor  - Initialized gremlin-groovy ScriptEngine with scripts/empty-sample.groovy
17410 [main] INFO  org.apache.tinkerpop.gremlin.server.util.ServerGremlinExecutor  - Initialized GremlinExecutor and configured ScriptEngines.
17419 [main] INFO  org.apache.tinkerpop.gremlin.server.util.ServerGremlinExecutor  - A GraphTraversalSource is now bound to [g] with graphtraversalsource[standardtitangraph[cassandrathrift:[cassandra]], standard]
17565 [main] INFO  org.apache.tinkerpop.gremlin.server.AbstractChannelizer  - Configured application/vnd.gremlin-v1.0+gryo with org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0
17566 [main] INFO  org.apache.tinkerpop.gremlin.server.AbstractChannelizer  - Configured application/vnd.gremlin-v1.0+gryo-stringd with org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0
17808 [main] INFO  org.apache.tinkerpop.gremlin.server.AbstractChannelizer  - Configured application/vnd.gremlin-v1.0+json with org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV1d0
17811 [main] INFO  org.apache.tinkerpop.gremlin.server.AbstractChannelizer  - Configured application/json with org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0
17958 [gremlin-server-boss-1] INFO  org.apache.tinkerpop.gremlin.server.GremlinServer  - Gremlin Server configured with worker thread pool of 1, gremlin pool of 8 and boss thread pool of 1.
17959 [gremlin-server-boss-1] INFO  org.apache.tinkerpop.gremlin.server.GremlinServer  - Channel started at port 8182.
1/21/17 4:34:20 PM =============================================================

-- Meters ----------------------------------------------------------------------
org.apache.tinkerpop.gremlin.server.GremlinServer.errors
             count = 0
         mean rate = 0.00 events/second
     1-minute rate = 0.00 events/second
     5-minute rate = 0.00 events/second
    15-minute rate = 0.00 events/second


180564 [metrics-logger-reporter-thread-1] INFO  org.apache.tinkerpop.gremlin.server.Settings$Slf4jReporterMetrics  - type=METER, name=org.apache.tinkerpop.gremlin.server.GremlinServer.errors, count=0, mean_rate=0.0, m1=0.0, m5=0.0, m15=0.0, rate_unit=events/second

Symptoms

So far, everything appears to be working as intended. The logs indicate that I am able to load srg.properties and bind the data structure to a variable called graph.

The problem appears when I try to connect to the Gremlin-Server instance over the published port 8182, for example using gremlin-python:

# executed via python 3.6.0 on the host machine, i.e. not inside of Docker
from gremlin_python import statics
from gremlin_python.structure.graph import Graph
from gremlin_python.process.graph_traversal import __
from gremlin_python.process.strategies import *
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection

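graph = Graph()
# 'g' is the traversal source name that Gremlin-Server reports binding in the log above
g = graph.traversal().withRemote(DriverRemoteConnection('ws://localhost:8182/gremlin', 'g'))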

produces the following exception ...

---------------------------------------------------------------------------
HTTPError                                 Traceback (most recent call last)
<ipython-input-10-59ad504f29b4> in <module>()
----> 1 g = graph.traversal().withRemote(DriverRemoteConnection('ws://localhost:8182/','g'))

/Users/lthibault/.pyenv/versions/3.6.0/lib/python3.6/site-packages/gremlin_python/driver/driver_remote_connection.py in __init__(self, url, traversal_source, username, password, loop, graphson_reader, graphson_writer)
     41         self._password = password
     42         if loop is None: self._loop = ioloop.IOLoop.current()
---> 43         self._websocket = self._loop.run_sync(lambda: websocket.websocket_connect(self.url))
     44         self._graphson_reader = graphson_reader or GraphSONReader()
     45         self._graphson_writer = graphson_writer or GraphSONWriter()

/Users/lthibault/.pyenv/versions/3.6.0/lib/python3.6/site-packages/tornado/ioloop.py in run_sync(self, func, timeout)
    455         if not future_cell[0].done():
    456             raise TimeoutError('Operation timed out after %s seconds' % timeout)
--> 457         return future_cell[0].result()
    458 
    459     def time(self):

/Users/lthibault/.pyenv/versions/3.6.0/lib/python3.6/site-packages/tornado/concurrent.py in result(self, timeout)
    235             return self._result
    236         if self._exc_info is not None:
--> 237             raise_exc_info(self._exc_info)
    238         self._check_done()
    239         return self._result

/Users/lthibault/.pyenv/versions/3.6.0/lib/python3.6/site-packages/tornado/util.py in raise_exc_info(exc_info)

HTTPError: HTTP 599: Stream closed

Suspecting a problem specific to this library, I ran two further tests:

1) attempt to connect to the websocket port with nc

$ nc -z -v localhost 8182
found 0 associations
found 1 connections:
     1: flags=82<CONNECTED,PREFERRED>
    outif lo0
    src ::1 port 58627
    dst ::1 port 8182
    rank info not available
    TCP aux info available

Connection to localhost port 8182 [tcp/*] succeeded!

2) attempt to connect to Gremlin-Server using a different client library, namely go-gremlin

Test case:

package main

import (
    "fmt"
    "log"

    "github.com/go-gremlin/gremlin"
)

func main() {
    if err := gremlin.NewCluster("ws://localhost:8182/gremlin"); err != nil {
        log.Fatal(err)
    }

    data, err := gremlin.Query(`graph.V()`).Exec()
    if err != nil {
        log.Fatalf("Query error: %s", err)
    }

    fmt.Println(string(data))
}

Output:

$ go run cmd/test/main.go 
2017/01/21 14:47:42 Query error: unexpected EOF
exit status 1

Current Conclusions & Questions

From the previous tests, I conclude that this is an application-level problem (i.e. a problem at the WebSocket protocol level, not a problem in the host or container networking stack). Indeed, nc reports that the socket connection is successful, but both the Python and Go client libraries ostensibly complain of an inappropriate (empty) response from the server.
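
One way to separate the two layers is to speak just enough of the protocol by hand. The following is a minimal sketch, not part of the tests above: websocket_probe is a hypothetical helper that opens a plain TCP socket, sends a bare WebSocket upgrade request, and returns whatever bytes come back first. The /gremlin path and the 5-second timeout are assumptions; the key is the sample value from RFC 6455.

```python
import socket


def websocket_probe(host, port, path="/gremlin", timeout=5.0):
    """Send a WebSocket upgrade request and return the first bytes of the reply.

    An empty result means the TCP connection was accepted but then closed
    (or reset) without any application-level response.
    """
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}:{port}\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        "Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==\r\n"  # sample key from RFC 6455
        "Sec-WebSocket-Version: 13\r\n"
        "\r\n"
    ).encode("ascii")
    with socket.create_connection((host, port), timeout=timeout) as sock:
        try:
            sock.sendall(request)
            return sock.recv(1024)
        except (ConnectionResetError, BrokenPipeError):
            # A reset from the peer is treated the same as "no reply".
            return b""
```

Calling websocket_probe("localhost", 8182) from the host would then show which case applies. An empty result means the connection was accepted and dropped without a reply, which would be consistent with the "Stream closed" and "unexpected EOF" errors. A reply beginning with an HTTP status line would mean something on the other end is at least answering the handshake.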

I have tried removing the /gremlin path from the websocket URL both in gremlin-python and in go-gremlin, to no avail.

My question is: where do I go from here? Any suggestions or diagnostic paths would be most appreciated!


Solution

  • The main problem is that host in your Gremlin Server configuration is set to localhost, which is the default. The server then listens only on the loopback interface, so it accepts connections only from the machine it runs on (here, from inside the container itself). You need to change the value to an external IP of the server or to 0.0.0.0.

    The other issue is that the gremlin-python server plugin was made available with Apache TinkerPop 3.2.2, whereas Titan 1.0.0 uses TinkerPop 3.0.1. I doubt that the gremlin-python 3.2.3 plugin will work with Titan 1.0.0.

    Update: Consider using JanusGraph 0.1.1 which uses TinkerPop 3.2.3. JanusGraph was forked from Titan, so the code is basically the same with updated dependencies.
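
The host change itself is a one-line edit to srg.yaml (host: 0.0.0.0). If you would rather check or patch the file as part of a build step, here is a minimal sketch. configured_host and set_host are hypothetical helpers, not part of Gremlin Server or Titan, and they assume host: appears as a top-level key at the start of a line, as it does in the srg.yaml shown in the question.

```python
import re

# Values that make the server listen on the loopback interface only.
LOOPBACK_HOSTS = {"localhost", "127.0.0.1", "::1"}


def configured_host(yaml_text):
    """Return the value of the top-level `host:` key, or None if it is absent."""
    match = re.search(r"^host:[ \t]*(\S+)", yaml_text, flags=re.MULTILINE)
    return match.group(1) if match else None


def set_host(yaml_text, host="0.0.0.0"):
    """Return the config text with the top-level `host:` value replaced."""
    return re.sub(
        r"^host:[ \t]*\S+",
        f"host: {host}",
        yaml_text,
        count=1,
        flags=re.MULTILINE,
    )
```

For example, if configured_host(text) is in LOOPBACK_HOSTS, writing set_host(text) back to conf/gremlin-server/srg.yaml before building the image makes the server listen on all interfaces, so the port published with -p 8182:8182 can reach it.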