python, neo4j, py2neo

How to get all nodes in a disjointed sub-graph - neo4j / py2neo


I have a neo4j database and I want to query for all the nodes that are in one specific disjoint subgraph (in py2neo or in Cypher).

Suppose I have groups of nodes: the nodes in each group are connected by relationships within that group, but there are no connections between groups. Can I query for one node and get all the nodes in that node's group?


Solution

  • [UPDATED]

    Original Answer

    If by "group of nodes" you mean a "disjoint subgraph", here is how you can get all the nodes in the disjoint subgraph (with relationships of any type) that contains a specific node (say, the Neo node):

    MATCH (n { name: "Neo" })
    OPTIONAL MATCH p=(n)-[*]-(m)
    RETURN REDUCE(s = [n], x IN COLLECT(NODES(p)) |
      REDUCE(t = s, y IN x | CASE WHEN y IN t THEN t ELSE t + y END )) AS nodes;
    

    This query uses an OPTIONAL MATCH to find the nodes "related" to the Neo node, so that if that node has no relationships, the query would still be able to return a result.

    The two (nested) REDUCE functions make sure that the returned collection contains only distinct nodes.

    The outer REDUCE initializes the result collection with the n node, since it must always be in the disjoint subgraph, even if there are no other nodes.
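
    To make the nested fold easier to follow, here is a minimal plain-Python sketch of the same de-duplication logic. It is only an illustration, not something the query needs: the node names and the paths list are hypothetical sample data, with each inner list standing in for NODES(p) of one matched path.

```python
from functools import reduce


def distinct_nodes(start, paths):
    """Fold all path nodes into one list, keeping only first occurrences."""

    def add_path(seen, path):
        # Inner fold: append each node of this path unless it was already collected.
        return reduce(
            lambda acc, node: acc if node in acc else acc + [node], path, seen
        )

    # Outer fold: seeded with the start node, so it is present even with no paths.
    return reduce(add_path, paths, [start])


# Hypothetical sample data: strings stand in for nodes.
paths = [["Neo", "Morpheus"], ["Neo", "Morpheus", "Trinity"], ["Neo", "Trinity"]]
print(distinct_nodes("Neo", paths))  # ['Neo', 'Morpheus', 'Trinity']
print(distinct_nodes("Neo", []))     # ['Neo']
```

    The second call mirrors the case where the OPTIONAL MATCH finds nothing: there are no paths to fold, and the result still contains the starting node.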

    Better Answer

    MATCH p=(n { name: "Neo" })-[r*0..]-(m)
    WITH NODES(p) AS nodes
    UNWIND nodes AS node
    RETURN DISTINCT node
    

    This simpler query (which returns one node per row) uses [r*0..] to allow 0-length paths (i.e., n does not need to have any relationships, and m can be the same as n). It uses UNWIND to turn each path's node collection into rows, and then uses DISTINCT to eliminate duplicates.
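
    Since the question also asks about py2neo, here is a hedged sketch of how the query above might be run from Python. It assumes a py2neo version whose Graph.run accepts keyword parameters, and it assumes the "$name" parameter syntax (older Neo4j servers used "{name}" instead). The helper name component_nodes and the connection details in the comment are placeholders, not part of the original answer.

```python
# Cypher from the answer above, with the node name passed as a parameter.
# Assumption: "$name" parameter syntax; older servers expect "{name}".
COMPONENT_QUERY = """
MATCH p=(n { name: $name })-[*0..]-(m)
WITH NODES(p) AS nodes
UNWIND nodes AS node
RETURN DISTINCT node
"""


def component_nodes(graph, name):
    """Return every node in the disjoint subgraph that contains the named node.

    `graph` is assumed to be a py2neo Graph, or any object with a compatible
    run(cypher, **parameters) method yielding records indexable by column name.
    """
    return [record["node"] for record in graph.run(COMPONENT_QUERY, name=name)]


# Usage against a real server (URI and credentials are placeholders):
#   from py2neo import Graph
#   graph = Graph("bolt://localhost:7687", auth=("neo4j", "password"))
#   for node in component_nodes(graph, "Neo"):
#       print(node)
```

    Passing the name as a parameter rather than formatting it into the query string avoids quoting problems and lets the server reuse the query plan.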

    Alternate Solution (does not work yet, due to bug)

    The alternate solution below (which returns node rows) should also work, but there is currently a bug (that I just reported) that causes all identifiers to be forgotten after a query unwinds a NULL (which can happen, for instance, if an OPTIONAL MATCH fails to find a match). Because of this bug, if the Neo node has no relationships, the query below currently returns no results. Therefore, you have to use the query above until the bug is fixed.

    MATCH (n { name: "Neo" })
    OPTIONAL MATCH p=(n)-[*]-(m)
    WITH n, NODES(p) AS nodes
    UNWIND nodes AS node
    RETURN DISTINCT (
      CASE
      WHEN node IS NULL THEN n
      ELSE node END ) AS res;