Search code examples
javaneo4jsubgraph

Induced subgraphs in neo4j


I have a graph in neo4j, and for a given node N, I would like to find all nodes that are reachable in a path of no longer than P steps from N, as well as all links between that set of nodes. It seems like this could be possible with either Cypher or the Traversal framework; is one preferred over the other? I'm doing this from Java with an embedded database, and I will need to perform further queries on the subgraph. I've poked around and haven't found any conclusive answers.


Solution

  • I think cypher is the most concise way of getting your desired data, querying for variable length paths, some collecting and refining:

    If n is the internal id of your node N and your P is 5:

    START begin = node(n)             // or e.g. index lookup
    MATCH p = (begin)<-[r*..5]-(end)  // match all paths of length up to 5
    WITH distinct nodes(p) as nodes   // collect the nodes contained in the paths
    MATCH (x)<-[r]-(y)                // find all relationships between nodes
    WHERE x in nodes and y in nodes   // which were found earlier
    RETURN distinct x,r,y             // and deduplicate as you find all pairs twice
    

    It might not be the most efficient way, but at least the execution plan explanation in http://console.neo4j.org/ suggests that the y in nodes is considered before the MATCH (x)-[r]-(y).

    I couldn't think of a way to avoid matching the relationships twice, therefore the distinct in the return statement.