I am trying to ingest data using load csv command in neo4j 4.0.4 enterprise edition CSV file has 15k rows and 9 columns below is my load csv command:
USING PERIODIC COMMIT LOAD CSV WITH HEADERS FROM "file:///data/csv_files/NA/admin_2020-05-18_10-19-23_481000/Relationships.csv"
AS row WITH row WHERE row.`RELATIONSHIP_NAME` IS NOT NULL OR row.`ASSET_1_ID` IS NOT NULL OR row.`ASSET_2_ID` IS NOT NULL
MERGE (relationship:Relationship{ name:row.`RELATIONSHIP_NAME`, asset2_id:row.`ASSET_2_ID`, asset1_id:row.`ASSET_1_ID` })
SET relationship += { name:row.`RELATIONSHIP_NAME`, asset2_id:row.`ASSET_2_ID`, asset1_id:row.`ASSET_1_ID` , relationship_status:row.`RELATIONSHIP_STATUS`, relationship_type_id:row.`RELATIONSHIP_TYPE_ID`, relationship_type:row.`RELATIONSHIP_TYPE`, asset1_type:row.`ASSET_1_TYPE`, asset1_id:row.`ASSET_1_ID`, asset2_type:row.`ASSET_2_TYPE`, asset2_id:row.`ASSET_2_ID`, relationship_comments:row.`RELATIONSHIP_COMMENTS`, modified_at: TIMESTAMP() }
and there are no relationships that are being created but it took around 7 mins to complete process and also I did explain query which has no eager operations.
dbms.memory.pagecache.size=12100m
dbms.memory.heap.max_size=11900m
dbms.memory.heap.initial_size=11900m
Am I missing something here or is there a way I can optimise the query.How would i get to improve the performance? Thanks in advance
Your problem is probably in merging by too? many properties at once. It is recommended to merge on a single property and then add additional properties. Another thing is to add indexes on the properties you will be using to MERGE
on.
Your query looks weird though. It looks like you are setting the same properties as you are MERGE-ing on. Without any additional information, I would change your query to:
USING PERIODIC COMMIT LOAD CSV WITH HEADERS FROM "file:///data/csv_files/NA/admin_2020-05-18_10-19-23_481000/Relationships.csv"
AS row WITH row WHERE row.`RELATIONSHIP_NAME` IS NOT NULL OR row.`ASSET_1_ID` IS NOT NULL OR row.`ASSET_2_ID` IS NOT NULL
CREATE (relationship:Relationship)
SET relationship += { name:row.`RELATIONSHIP_NAME`, asset2_id:row.`ASSET_2_ID`, asset1_id:row.`ASSET_1_ID` , relationship_status:row.`RELATIONSHIP_STATUS`, relationship_type_id:row.`RELATIONSHIP_TYPE_ID`, relationship_type:row.`RELATIONSHIP_TYPE`, asset1_type:row.`ASSET_1_TYPE`, asset1_id:row.`ASSET_1_ID`, asset2_type:row.`ASSET_2_TYPE`, asset2_id:row.`ASSET_2_ID`, relationship_comments:row.`RELATIONSHIP_COMMENTS`, modified_at: TIMESTAMP() }