I'm loading RDF data into a JUNG graph to do some analysis. So I create a new graph with:
DirectedGraph<String, GraphLink> g = new DirectedSparseGraph<String, GraphLink>();
I created a support class for specifying the link:
public class GraphLink {
    String uri;
    Float weight;
}
Then I populate it like this:
for each RDF triple <s, p, o> {
    g.addVertex( s )
    g.addVertex( o )
    GraphLink link = new GraphLink()
    link.uri = p
    link.weight = some weight
    g.addEdge( link, s, o )
}
Is this an efficient way of doing it, or are there better ways? Wrapping every predicate in its own edge object feels counterintuitive, but if I use the predicate itself as the edge:
g.addEdge( p, s, o )
I get a duplicate-edge exception.
Any hints?
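The exception looks consistent with a graph that treats each edge object as a unique key, so the same predicate string cannot be attached to two different vertex pairs. Below is a minimal sketch of that bookkeeping, assuming the edge table behaves like a map keyed by equals/hashCode. It uses no JUNG classes: EdgeKeyDemo, its addEdge method, and the example URI are hypothetical names made up for the illustration, and the GraphLink constructor is added only for brevity.

```java
import java.util.HashMap;
import java.util.Map;

public class EdgeKeyDemo {

    // Same fields as the GraphLink class above. It does not override
    // equals/hashCode, so every instance is a distinct map key.
    static class GraphLink {
        String uri;
        Float weight;

        // Constructor added for this sketch only.
        GraphLink(String uri, Float weight) {
            this.uri = uri;
            this.weight = weight;
        }
    }

    // Hypothetical stand-in for a graph's edge table: edge object -> endpoints.
    private final Map<Object, String> edges = new HashMap<>();

    // Assumed behaviour: an edge object that is already registered with
    // different endpoints is rejected; re-adding it unchanged is a no-op.
    boolean addEdge(Object edge, String source, String target) {
        String endpoints = source + " -> " + target;
        String existing = edges.get(edge);
        if (existing != null && !existing.equals(endpoints)) {
            throw new IllegalArgumentException(
                "edge " + edge + " already exists with endpoints " + existing);
        }
        if (existing != null) {
            return false;
        }
        edges.put(edge, endpoints);
        return true;
    }

    int edgeCount() {
        return edges.size();
    }

    public static void main(String[] args) {
        String p = "http://example.org/knows";

        // Reusing the predicate string as the edge: the second call is rejected.
        EdgeKeyDemo byString = new EdgeKeyDemo();
        byString.addEdge(p, "a", "b");
        try {
            byString.addEdge(p, "b", "c");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }

        // One fresh GraphLink per triple: both edges are accepted.
        EdgeKeyDemo byLink = new EdgeKeyDemo();
        byLink.addEdge(new GraphLink(p, 1.0f), "a", "b");
        byLink.addEdge(new GraphLink(p, 1.0f), "b", "c");
        System.out.println(byLink.edgeCount()); // prints 2
    }
}
```

Under that assumption, one wrapper object per triple is the price of letting the same predicate appear on many edges. The wrapper only works as long as it keeps identity-based equality; giving it an equals method that compares just the URI would bring the collision back.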
UPDATE: this code seems to work well:
DirectedGraph<RDFNode,Statement> g = new DirectedSparseGraph<RDFNode,Statement>()
// list all statements
// TODO: pagination for very large graphs.
assert m.size() < 10000000, "graph is too large."
m.listStatements().each { stm ->
    RDFNode sub = stm.getSubject()
    RDFNode obj = stm.getObject()
    g.addVertex( sub )
    if ( includeLiterals || !obj.isLiteral() ) {
        g.addVertex( obj )
        g.addEdge( stm, sub, obj, EdgeType.DIRECTED )
    }
}
Mulone
This may not be what you want at all, but you could try JenaJung, which presents a Jena model as a JUNG graph.
From the README file:
Model model = FileManager.get().loadModel("http://example.com/data.rdf");
Graph<RDFNode, Statement> g = new JenaJungGraph(model);
Layout<RDFNode, Statement> layout = new FRLayout(g);
layout.setSize(new Dimension(300, 300));
BasicVisualizationServer<RDFNode, Statement> viz =
    new BasicVisualizationServer<RDFNode, Statement>(layout);