Cypher query to find all relationships among multiple nodes


I have a complex graph, as shown in the image below:

[image: the graph]

Here every relationship has a type value. I need to write a Cypher query that finds all relationships (with their type values) among a given set of two or more nodes. The nodes can be entered in any order, like x64->Linux->Oracle or Oracle->Linux->10.2.

EDIT

I am expecting output like this: all combinations of the nodes, together with the name of the relationship that links them.

  1. For input: x64->Linux->Oracle

[image: expected output]

  2. For input: Linux->64->Oracle->12c

[image: expected output]

DATA

Data can be accessed from https://www.dropbox.com/s/w28omkdrgmhv7ud/Neo4j%20Queries.txt?dl=0

EDIT: New output format for input x64->Linux->Oracle

[image: new output format]


Solution

  • Provided you're looking for only relationships directly connecting each pair of nodes in the set (as opposed to finding all multi-hop paths between pairs of nodes in the set), APOC Procedures has apoc.algo.cover() for exactly this use case:

    ...
    // assume `nodes` is the collection of nodes
    CALL apoc.algo.cover(nodes) YIELD rel
    RETURN rel
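
    As a fuller sketch of how `nodes` might be built, the query below reuses the :Entity label and the key/value filter from the longer queries later in this answer. Those labels and property values are assumptions about the data and should be adapted. It returns each covering relationship with its two endpoints and its type value.

    ```cypher
    // Assumption: input nodes are :Entity nodes selected by `key` and `value`
    // (taken from the longer queries in this answer; adjust to your data).
    MATCH (entity:Entity)
    WHERE entity.key IN ['Product', 'Version', 'BinaryType']
      AND entity.value IN ['pc', '10.2', '64']
    WITH collect(entity) AS nodes
    // apoc.algo.cover yields the relationships whose start and end nodes are both in `nodes`
    CALL apoc.algo.cover(nodes) YIELD rel
    RETURN startNode(rel).value AS source, rel.type AS relType, endNode(rel).value AS target
    ```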
    

    EDIT

    As mentioned in my comment, your change to the requirements drastically changes the nature of the question.

    You seem to want complete path results (directed), including nodes that are not in your input, and you want to ensure that the same type attribute is present for all relationships in the path.

    This requires that we find the ordering of those nodes so we can identify a path between all of them. While we could try every permutation of the input nodes as the traversal order, I think we can get away with just finding ordered pairs of start and end nodes (by UNWINDing the collection twice and removing rows where the start and end node are the same). We'll first collect the incoming and outgoing relationship types of every input node so we can use some set operations to find the types that could possibly appear on relationships connecting all the nodes: the outgoing types of the start node, intersected with the incoming types of the end node, intersected with the types that both enter and leave each of the other nodes.
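
    The set operations described above can be tried in isolation on literal lists. In this sketch the type names ('A' to 'E') and the three-node setup are made up for illustration. Only apoc.coll.intersection and reduce() are taken from the actual query.

    ```cypher
    // Hypothetical type lists; the values are invented for illustration only.
    WITH ['A', 'B', 'C'] AS startOutputTypes,    // types leaving the start node
         ['B', 'C', 'D'] AS endInputTypes,       // types entering the end node
         [['B', 'C'], ['C', 'E']] AS otherTypes  // per remaining node: types that both enter and leave it
    // Types that can leave the start node and arrive at the end node
    WITH apoc.coll.intersection(startOutputTypes, endInputTypes) AS possibleTypes, otherTypes
    // Narrow further: every intermediate node must also carry the type in and out
    RETURN reduce(acc = possibleTypes, types IN otherTypes | apoc.coll.intersection(acc, types)) AS possibleTypes
    ```

    With these hypothetical lists, 'C' is the only type that appears in every list, so it would be the only type worth trying in the path match.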

    From the rows that remain after this filtering we can match the variable-length paths that can connect all of those nodes, restricting each path to one of the candidate types so that it only traverses relationships of a single type. Afterwards we filter to ensure that all the input nodes are in the path.

    We'll assume the nodes are labeled :Entity, with the properties 'key' and 'value'.

    // Collect the input nodes
    MATCH (entity:Entity)
    WHERE entity.key in ['Product','Version','BinaryType'] AND entity.value in ['pc','10.2','64']
    WITH collect(entity) as nodes
    // For each input node, gather the distinct types of its incoming and outgoing relationships
    UNWIND nodes as node
    WITH nodes, node, [()-[r]->(node) | r.type] as inputTypes, [(node)-[r]->() | r.type] as outputTypes
    WITH nodes, node, apoc.coll.toSet(inputTypes) as inputTypes, apoc.coll.toSet(outputTypes) as outputTypes
    WITH nodes, collect({node:node, inputTypes:inputTypes, outputTypes:outputTypes}) as nodeData
    // Build every ordered (start, end) pair of distinct input nodes
    UNWIND nodeData as start
    UNWIND nodeData as end
    WITH nodes, start, end, nodeData
    WHERE start <> end
    // Keep only types that leave the start node, enter the end node,
    // and both enter and leave every other input node
    WITH nodes, start, end, apoc.coll.subtract(nodeData, [start, end]) as theRest
    WITH nodes, start.node as start, end.node as end, apoc.coll.intersection(start.outputTypes, end.inputTypes) as possibleTypes, [data in theRest | apoc.coll.intersection(data.inputTypes, data.outputTypes)] as otherTypes
    WITH nodes, start, end, reduce(possibleTypes = possibleTypes, types in otherTypes | apoc.coll.intersection(possibleTypes, types)) as possibleTypes
    WHERE size(possibleTypes) > 0
    // For each candidate type, match single-type paths that contain all input nodes
    UNWIND possibleTypes as type
    MATCH path = (start)-[*]->(end)
    WHERE all(rel in relationships(path) WHERE rel.type = type)
     AND length(path) >= size(nodes) - 1
     AND all(node in nodes WHERE node in nodes(path))
    RETURN nodes(path) as pathNodes, type
    

    To do this with both type and level, we need to collect both of them earlier in the query, so instead of dealing with just a type, we're dealing with a map of both the type and the level. This does make the query a bit more complex, but it's necessary to ensure that the paths provided have the same type and level for all relationships in the path.

    // Collect the input nodes
    MATCH (entity:Entity)
    WHERE entity.key in ['Product','Version','BinaryType'] AND entity.value in ['pc','10.2','64']
    WITH collect(entity) as nodes
    // For each input node, gather the distinct {type, level} maps of its incoming and outgoing relationships
    UNWIND nodes as node
    WITH nodes, node, [()-[r]->(node) | {type:r.type, level:r.level}] as inputs, [(node)-[r]->() | {type:r.type, level:r.level}] as outputs
    WITH nodes, collect({node:node, inputs:apoc.coll.toSet(inputs), outputs:apoc.coll.toSet(outputs)}) as nodeData
    // Build every ordered (start, end) pair of distinct input nodes
    UNWIND nodeData as start
    UNWIND nodeData as end
    WITH nodes, start, end, nodeData
    WHERE start <> end
    // Keep only {type, level} maps that leave the start node, enter the end node,
    // and both enter and leave every other input node
    WITH nodes, start, end, apoc.coll.subtract(nodeData, [start, end]) as theRest
    WITH nodes, start.node as start, end.node as end, apoc.coll.intersection(start.outputs, end.inputs) as possibles, [data in theRest | apoc.coll.intersection(data.inputs, data.outputs)] as others
    WITH nodes, start, end, reduce(possibles = possibles, data in others | apoc.coll.intersection(possibles, data)) as possibles
    WHERE size(possibles) > 0
    // For each candidate, match paths whose relationships all share that type and level
    UNWIND possibles as typeAndLevel
    MATCH path = (start)-[*]->(end)
    WHERE all(rel in relationships(path) WHERE rel.type = typeAndLevel.type AND rel.level = typeAndLevel.level)
     AND length(path) >= size(nodes) - 1
     AND all(node in nodes WHERE node in nodes(path))
    RETURN nodes(path) as pathNodes, typeAndLevel.type as type, typeAndLevel.level as level
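
    As a small sketch of why maps are intersected here rather than bare types: a candidate should only survive when both its type and its level match on each side. The values below are hypothetical and are not taken from the linked data.

    ```cypher
    // Hypothetical {type, level} maps; the values are invented for illustration only.
    WITH [{type: 'A', level: 1}, {type: 'A', level: 2}] AS startOutputs,
         [{type: 'A', level: 2}, {type: 'B', level: 2}] AS endInputs
    // The query above relies on apoc.coll.intersection comparing maps by value,
    // so a shared type with a different level does not count as a match.
    RETURN apoc.coll.intersection(startOutputs, endInputs) AS possibles
    ```

    Under that assumption only the map with type 'A' and level 2 should remain, even though type 'A' and level 2 each appear in other entries.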