cassandra, graph-databases, titan, rexster

How to configure multiple Cassandra nodes as storage.backend in Rexster config?


I have Titan/Rexster running on a machine and a 3-node Cassandra cluster as the storage.backend for the Titan graph DB. I want to configure Rexster to connect to all 3 nodes of the Cassandra cluster. I have listed the IP addresses of all the Cassandra nodes, comma-separated, as shown below.

<graph>
    <graph-name>graph</graph-name>
    ...
    <properties>
        <storage.backend>cassandrathrift</storage.backend>
        <storage.hostname>10.240.182.197,10.240.166.40,10.240.78.153</storage.hostname>
        ...
    </properties>
</graph>

But it seems that Rexster connects only to the first node, "10.240.182.197". If I shut down node 10.240.182.197, Rexster is unable to connect to the other nodes and throws an exception.

Rexster startup log

[INFO] RexsterApplicationGraph - Graph [graph] - configured with allowable namespace [tp:gremlin]
[INFO] GraphConfigurationContainer - Graph graph - titangraph[cassandrathrift:10.240.182.197] loaded
[INFO] RexsterApplicationGraph - Graph [tinkergraph] - configured with allowable namespace [tp:gremlin]
[INFO] GraphConfigurationContainer - Graph tinkergraph - tinkergraph[vertices:0 edges:0 directory:data/graph-example-1] loaded

[update] I changed the config from "cassandrathrift" to "cassandra" and now it's able to connect to all nodes.

Now my question is: why is the "cassandrathrift" API not able to connect to the other nodes? What is the difference between using "cassandrathrift" and "cassandra"? What are the pros and cons? Which one is faster at loading data into the graph and retrieving it?


Solution

  • The "cassandrathrift" adapter does not have the intelligence to do load balancing or node discovery by itself. It always tries to connect to the first listed host IP, so there is no load balancing, and when that first host goes down Rexster stops. With the Astyanax adapter, you get automatic ring discovery and fault detection. Set storage.backend to "cassandra" as shown below.

    Modified config (rexster.xml):

    <graph>
        <graph-name>graph</graph-name>
        ...
        <properties>
            <storage.backend>cassandra</storage.backend>
            <storage.hostname>10.240.182.197,10.240.166.40,10.240.78.153</storage.hostname>
            ...
        </properties>
    </graph>
    

    After this, restart Titan/Rexster; it then connected to all the nodes.

    Ref : Aurelius › Rexster/Titan-Cassandra high availability
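
The difference described in the solution can be illustrated with a small Python sketch. This is not Titan's or Astyanax's actual code, only a toy model of "try the first listed host only" versus "try every listed host". The function names (tcp_probe, pick_first_only, pick_with_failover) are hypothetical, and port 9160 is assumed to be the Cassandra Thrift RPC port (the usual default, but check your cassandra.yaml).

```python
import socket
from typing import Callable, Iterable, Optional


def tcp_probe(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a plain TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def pick_first_only(
    hosts: Iterable[str], port: int, probe: Callable[[str, int], bool] = tcp_probe
) -> Optional[str]:
    """Toy model of the behaviour reported for cassandrathrift:
    only the first listed host is ever tried."""
    hosts = list(hosts)
    if hosts and probe(hosts[0], port):
        return hosts[0]
    return None


def pick_with_failover(
    hosts: Iterable[str], port: int, probe: Callable[[str, int], bool] = tcp_probe
) -> Optional[str]:
    """Toy model of failover: try each listed host in order and
    return the first one that answers."""
    for host in hosts:
        if probe(host, port):
            return host
    return None


if __name__ == "__main__":
    # Same comma-separated value as storage.hostname in rexster.xml.
    storage_hostname = "10.240.182.197,10.240.166.40,10.240.78.153"
    hosts = storage_hostname.split(",")
    thrift_port = 9160  # assumption: default Thrift rpc_port
    print("first-only:", pick_first_only(hosts, thrift_port))
    print("failover:  ", pick_with_failover(hosts, thrift_port))
```

With the first node shut down, the first-only variant returns None while the failover variant returns the next reachable host. This mirrors the symptom in the question, though real ring discovery in Astyanax does considerably more than this loop.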