I am using Titan 1.0.0 and Cassandra as a backend.
I have location data (latitude, longitude) as a nodes and edges between these nodes. I want to find shortest path from node A to node B. The graph size is very large. Currently I am using this query to find path between two nodes.
g.V(fromNode).repeat(both().simplePath()).until(is(toNode)).limit(1).path().fill(list);
This query is very inefficient and gives memory error for path size more than 10. After reading about shortest path algorithms, I got to know that implementing A* will be more feasible than implementing Dijkstra's as in A* less graph will be explored.
Now I can use JUNG and load TitanGraph into the memory and implement my own A* to get shortest path. But I don't want whole graph to be inside memory.
Kindly suggest me on how should I implement A* on TitanGraph. Should I query graph using API or should I load graph into memory first then run the A* on in-memory graph.
I will be thankful for any other suggestions as well.
First, this is a question more associated with Tinkerpop3 rather than Titan. I agree with you that loading the graph in memory is a totally bad idea.
The best way is to implement the A star algorithm to run straight on the Gremlin console.
The Gremlin console executes Groovy code. You could implement the A* algorithm on a .groovy
file and load it to Gremlin console this way:
gremlin> :load my_folder/a-star-algo-tp3.groovy
Let's say you defined a method inside this file called:
List<Element> a_star_algo(Graph graph, Object fromId, Object toId)
You could just call it from your gremlin console this way:
list = a_star_algo(graph, v1.id(), v2.id())
You may implement this method using A* algorithm. This repo implements the algorithm using their own data structure in Groovy. I'm not sure how efficient their implementation is but you may have a look.