
AWS Lambda + Tinkerpop/Gremlin + TitanDB on EC2 + AWS DynamoDB in cloud


I am trying to implement the following flow:

  1. a user hits AWS API Gateway (REST),

  2. it triggers AWS Lambda,

  3. which uses Tinkerpop/Gremlin to connect to

  4. TitanDB on EC2, which uses

  5. AWS DynamoDB in the cloud (not on EC2) as its backend.

So far I have managed to create a fully working TitanDB instance on EC2 that stores its data in DynamoDB in the cloud. I am also able to connect from AWS Lambda to EC2 through Tinkerpop/Gremlin, BUT only this way:

Cluster.build()
       .addContactPoint("10.x.x.x") // ip of EC2
       .create()
       .connect()
       .submit("here I type my query as string and it will work");

This works; however, I would strongly prefer to use a "Criteria API" (GremlinPipeline) instead of plain Gremlin strings. In other words, I need an ORM or something like it. I know that Tinkerpop includes one. I have realized that what I need is an object of class Graph. This is what I have tried:

Graph graph = TitanFactory
            .build()
            .set("storage.hostname", "10.x.x.x")
            .set("storage.backend", "com.amazon.titan.diskstorage.dynamodb.DynamoDBStoreManager")
            .set("storage.dynamodb.client.credentials.class-name", "com.amazonaws.auth.DefaultAWSCredentialsProviderChain")
            .set("storage.dynamodb.client.credentials.constructor-args", "")
            .set("storage.dynamodb.client.endpoint", "https://dynamodb.ap-southeast-2.amazonaws.com")
            .open();

However, it throws "Could not find implementation class: com.amazon.titan.diskstorage.dynamodb.DynamoDBStoreManager". The error is accurate, of course, as IntelliJ IDEA cannot find the class either.

My dependencies:

//
// aws
compile 'com.amazonaws:aws-lambda-java-core:+'
compile 'com.amazonaws:aws-lambda-java-events:+'
compile 'com.amazonaws:aws-lambda-java-log4j:+'
compile 'com.amazonaws:aws-java-sdk-dynamodb:1.10.5.1'
compile 'com.amazonaws:aws-java-sdk-ec2:+'
//
// database
// titan 1.0.0 is compatible with gremlin 3.0.2-incubating, but not yet with 3.2.0
compile 'com.thinkaurelius.titan:titan-core:1.0.0'
compile 'org.apache.tinkerpop:gremlin-core:3.0.2-incubating'
compile 'org.apache.tinkerpop:gremlin-driver:3.0.2-incubating'

My goal: to have a fully working Graph object.

My problem: I don't have the DynamoDBStoreManager class, and I do not know which dependency I have to add.

My additional question: why does connecting through the Cluster class work with only an IP, while TitanFactory requires properties like those I used for gremlin-server on EC2? I do not want to create a second server; I just want to connect to the existing one as a client and obtain a Graph object.

EDIT: After adding the resolver, it builds, but the output contains multiple lines like:

13689 [TitanID(0)(4)[0]] WARN com.thinkaurelius.titan.diskstorage.idmanagement.ConsistentKeyIDAuthority - Temporary storage exception while acquiring id block - retrying in PT2.4S: com.thinkaurelius.titan.diskstorage.TemporaryBackendException: Wrote claim for id block [1, 51) in PT0.342S => too slow, threshold is: PT0.3S

and execution hangs in the open() method, so I cannot execute any queries.


Solution

  • For the DynamoDBStoreManager class, you would need this dependency:

    compile 'com.amazonaws:dynamodb-titan100-storage-backend:1.0.0'
    

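    Then, for the DynamoDBLocal issue, try adding this resolver (shown in sbt syntax; in a Gradle build like yours, the same URL would be declared as a Maven repository in the repositories block):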

    resolvers += "AWS DynamoDB Local Release Repository" at "http://dynamodb-local.s3-website-us-west-2.amazonaws.com/release"
    

    I'm not entirely clear on what you mean by using a "Criteria API" instead of plain Gremlin language. I'm guessing you mean that you want to interact with the graph using Java rather than passing Gremlin as a string to a running Titan/Gremlin Server? If so, you don't need to start a Titan/Gremlin Server at all (step 4 above). Write an AWS Lambda program (steps 2-3 above) that creates a direct Titan client connection via TitanFactory, where all of the Titan configuration properties point at your DynamoDB instance (step 5 above).
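A minimal sketch of the configuration side of that embedded setup, under a few assumptions. It only collects the storage settings from the question into a `java.util.Properties` object and writes them to a temporary file, so it compiles without Titan on the classpath. The class and method names (`TitanDynamoConfigSketch`, `titanProperties`, `writeConfig`) are hypothetical. `storage.hostname` is left out on the assumption that the EC2 host is no longer involved once Lambda talks to DynamoDB directly. Opening the graph is shown only as a comment, since it needs Titan and the DynamoDB storage backend dependency.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

public class TitanDynamoConfigSketch {

    // Collects the DynamoDB-related settings from the question in one place.
    // The endpoint is a parameter so the region is not hard-coded.
    public static Properties titanProperties(String dynamoEndpoint) {
        Properties props = new Properties();
        props.setProperty("storage.backend",
                "com.amazon.titan.diskstorage.dynamodb.DynamoDBStoreManager");
        props.setProperty("storage.dynamodb.client.credentials.class-name",
                "com.amazonaws.auth.DefaultAWSCredentialsProviderChain");
        props.setProperty("storage.dynamodb.client.credentials.constructor-args", "");
        props.setProperty("storage.dynamodb.client.endpoint", dynamoEndpoint);
        return props;
    }

    // Writes the settings to a temporary .properties file and returns its path.
    public static Path writeConfig(Properties props) throws IOException {
        Path file = Files.createTempFile("titan-dynamodb", ".properties");
        try (OutputStream out = Files.newOutputStream(file)) {
            props.store(out, "Titan on DynamoDB (sketch)");
        }
        return file;
    }

    public static void main(String[] args) throws IOException {
        Properties props =
                titanProperties("https://dynamodb.ap-southeast-2.amazonaws.com");
        Path file = writeConfig(props);
        System.out.println("Wrote " + props.size() + " settings to " + file);

        // Assumption: with Titan 1.0.0 and the DynamoDB storage backend on the
        // classpath, the graph could then be opened in-process, for example:
        //   TitanGraph graph = TitanFactory.open(file.toString());
        //   GraphTraversalSource g = graph.traversal();
    }
}
```

Keeping the settings in one helper means the same values can be passed either as a properties file or, key by key, through `TitanFactory.build().set(...)` as in the question.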