Search code examples
neo4jcypher

Cypher query to verify all items in an ordered list are represented in order by a nodes in the graph with specified relationship


I am modeling a simple file system, with folders and files.

(:Folder)<-[:IS_IN *]-(:Folder)<-[:IS_IN]-(:File)

A user can give me an arbitrary path, /foo/bar/baz.txt. In this case the graph would look like:

(:Folder {Name: '/'})<-[:IS_IN]-(:Folder {Name: 'Foo'})<-[:IS_IN]-(:Folder {Name: 'bar'})<-[:IS_IN]-(:File {Name: 'baz.txt'})

Without knowing how long the path a user will supply, is there a good way to query whether a given path is represented in the graph? If I split path into an ordered list of folder names, can I write a query that returns the nodes if and only if they exist in the same order as the list?

Naively I can dynamically generate a query with each node, but I'd rather not generate dynamic queries.


Solution

  • Assuming that you pass this $path parameter, and that the last element in the path is a file:

    "/foo/bar/baz.txt"
    

    And assuming that your data looks like this:

    (:Folder {Name: '/'})<-[:IS_IN]-
      (:Folder {Name: 'foo'})<-[:IS_IN]-
        (:Folder {Name: 'bar'})<-[:IS_IN]-
          (:File {Name: 'baz.txt'})
    

    This should work:

    WITH SPLIT($path, '/') AS p
    WITH REVERSE(CASE WHEN p[0] = '' THEN ['/']+p[1..] ELSE p END) AS names
    OPTIONAL MATCH (file:File {Name: names[0]})
    RETURN REDUCE(s=file, n IN names[1..] |
      CASE WHEN s IS NOT NULL THEN [(s)-[:IS_IN]->(f1 {Name: n})|f1][0] END
    ) IS NOT NULL AS found
    

    The query will return whether the path in $path is found.

    It breaks up the $path string into an array of folder and file names, reverses it, and then optionally matches the file node. We use an OPTIONAL MATCH so that the query always returns a value, even when the file is not found. Then we use the REDUCE function to loop through the remaining folders in the path to verify that each needed relationship exists. We use a pattern comprehension to perform matches within the loop.