
Neo4j Cypher - Delete an existing node if its JSON source file is no longer found


I currently load a directory of JSON files into my Neo4j database each day.

If a file ID already exists in the database, I do nothing. If there are new file IDs, I create new nodes. If a file ID no longer exists in the directory, I would like to delete the node with the matching ID.

With apoc.load.json I am using failOnError:false so that the script doesn't fail if any of the files no longer exist and therefore cannot be loaded.

I have tried various ways of passing on the ID of the missing file, i.e. the case where the load has returned a null. The best I have come up with so far is the query below, which still does not delete the required node: when the error occurs, it simply moves on to the next file to load. This is a snippet; further code for node creation follows after this point:

UNWIND range(1,100) as id
WITH DISTINCT id + ".json" as file,id
CALL apoc.load.json("file:///output/"+file, null, {failOnError:false})
YIELD value
WITH value,id
CALL apoc.do.when(value is not null, "RETURN value",
"MATCH (n:File {FileId: id}) DETACH DELETE n", {value:value, id:id}) YIELD value as v
RETURN v

Is there a way to capture the failOnError and process Cypher at that point to be able to delete the required nodes?

A similar question is discussed here: https://github.com/neo4j-contrib/neo4j-apoc-procedures/issues/1149. I have also tried the collect/coalesce combination to return something to work on, but this does not work either.

Thank you for any assistance!


Solution

  • I've tested the code, and there seems to be no way to get an empty map back when the apoc.load.json procedure fails. It may be worth opening another feature request that explains your problem. In the meantime, I would suggest the following workaround.

    First, add a secondary label to all File nodes.

    MATCH (f:File)
    SET f:Delete
    

    Then run your code and update the nodes however you want, removing the secondary label from every File node that still has a source file:

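    UNWIND range(1,100) as id
    WITH DISTINCT id + ".json" as file,id
    // per the test above, a file that fails to load produces no row,
    // so only ids with an existing source file reach the MATCH
    CALL apoc.load.json("file:///output/"+file, null, {failOnError:false})
    YIELD value
    WITH value,id
    MATCH (n:File {FileId: id})
    REMOVE n:Delete
    // do some updates etc...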
    

    Once that is finished, delete all the nodes that still carry the secondary label, i.e. those that weren't matched during the process:

    MATCH (d:Delete)
    DETACH DELETE d
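
    As a possible variation that avoids the temporary label, the ids that did load can be collected into a list and every File node outside that list deleted. This is an untested sketch: it assumes, as observed above, that a failed load simply yields no rows, and that FileId is stored as the same integer used in the range. The name loadedIds is introduced here purely for illustration.

    ```cypher
    UNWIND range(1,100) AS id
    CALL apoc.load.json("file:///output/" + id + ".json", null, {failOnError:false})
    YIELD value
    // only ids whose file loaded survive to this point;
    // DISTINCT guards against files that yield more than one row
    WITH collect(DISTINCT id) AS loadedIds
    MATCH (n:File)
    WHERE NOT n.FileId IN loadedIds
    DETACH DELETE n
    ```

    Be aware that if no file loads at all (for example, because the directory is temporarily unavailable), loadedIds is empty and every File node is deleted, so the label-based approach above is the more cautious of the two.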