Search code examples
csvneo4jcypherload-csv

Multiple LOAD CSV statements in one Cypher query


Trying to import rows and create nodes from different .csv files in one cypher query:

// User nodes
LOAD CSV WITH HEADERS
FROM 'file:///profile.csv' AS profile_line
CREATE (user:User { userId: profile_line.User })

// Project nodes
LOAD CSV WITH HEADERS
FROM 'file:///project.csv' AS project_line
CREATE (project:Project { projectId: project_line.projectId })

// Image nodes
LOAD CSV WITH HEADERS
FROM 'file:///media.csv' AS image_line
CREATE (image:Image { imageId: '<imageId>' })

Throws the following error:

"WITH is required between CREATE and LOAD CSV (line 9, column 1 (offset: 211)) "CREATE (project:Project { projectId: project_line.projectId })"

I am unclear as to how the WITH statement should be constructed.


Solution

  • If you're using the Neo4j Browser, the easiest way to do this is to just separate your statements with semicolons and turn on the 'multi-statement query editor' (which also gives you a nice little progress indicator as each statement is run):

    enter image description here

    LOAD CSV WITH HEADERS
    FROM 'file:///profile.csv' as profile_line
    CREATE (user: User { userId: profile_line.User });
    
    LOAD CSV WITH HEADERS
    FROM 'file:///project.csv' as project_line
    CREATE (project: Project { projectId: project_line.projectId });
    
    LOAD CSV WITH HEADERS
    FROM 'file:///media.csv' as image_line
    CREATE (image: Image { imageId: image_line.ImageId });
    

    enter image description here

    Otherwise, it is still possible to do this. What we want is a WITH statement that will return a single row irrespective of how many nodes the previous CREATE ended up creating, so any aggregation function will do. For example:

    LOAD CSV WITH HEADERS
    FROM 'file:///profile.csv' as profile_line
    CREATE (user: User { userId: profile_line.User })
    
    WITH max(1) as dummy
    LOAD CSV WITH HEADERS
    FROM 'file:///project.csv' as project_line
    CREATE (project: Project { projectId: project_line.projectId })
    
    WITH max(1) as dummy
    LOAD CSV WITH HEADERS
    FROM 'file:///media.csv' as image_line
    CREATE (image: Image { imageId: image_line.ImageId })
    
    Added 9 labels, created 9 nodes, set 9 properties, completed after 17 ms.
    

    Though I think this is super unclear for future-you, and I wouldn't recommend it.