To import XML data into a Neo4j database, I first parse the XML into a Python dictionary and then run Cypher queries like the one below (pmid has a uniqueness constraint):
WITH $pubmed_dict AS pubmed_article
UNWIND pubmed_article AS particle
MERGE (p:Publication {pmid: particle.MedlineCitation.PMID.text})
ON CREATE SET p.title = COALESCE(particle.MedlineCitation.Article.Journal.Title, particle.MedlineCitation.Article.ArticleTitle)
ON MATCH SET p.title = COALESCE(particle.MedlineCitation.Article.Journal.Title, particle.MedlineCitation.Article.ArticleTitle)
FOREACH (author IN particle.MedlineCitation.Article.AuthorList.Author |
  MERGE (a:Author {last_name: COALESCE(author.LastName, 'LAST NAME MISSING!'), first_name: COALESCE(author.ForeName, 'FIRST NAME MISSING!')})
  MERGE (p)<-[:WROTE]-(a)
)
FOREACH (ref IN particle.MedlineCitation.CommentsCorrectionsList.CommentsCorrections |
  MERGE (cited_p:Publication {pmid: COALESCE(ref.PMID.text, 'NO-PMID')})
  MERGE (cited_p)<-[:REFERENCES]-(p)
)
My particle has the following dictionary structure:
What I want to achieve in the second FOREACH loop is this: if there is a particle.MedlineCitation.CommentsCorrectionsList.CommentsCorrections list, and an entry in it is a map with PMID.text, then nothing should change if a publication with that PMID already exists; otherwise a new node should be created with that PMID. In both cases the node should get a REFERENCES relationship from the p:Publication created at the start of the query.
I have not found the syntax for such a case yet. My only workaround so far, COALESCE(ref.PMID.text, 'NO-PMID'), still creates a :REFERENCES relationship (to a placeholder 'NO-PMID' node) whenever a CommentsCorrectionsList entry has no PMID.
Does anyone have a better solution?
You can use CASE WHEN ... THEN ... ELSE ... END to build a list that is either empty or has one element, and then run FOREACH over it. This effectively gives you a conditionally executed block.
For example:
FOREACH (_ IN CASE WHEN exists(p.foo) THEN [true] ELSE [] END |
  ....
)
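Applied to the second loop in the question, the trick could look roughly like the sketch below. It is not a tested import script. It assumes that CommentsCorrections is a list of maps whenever it is present, it leaves out the author loop for brevity, and the helper name import_articles is made up for illustration. The condition uses IS NOT NULL rather than exists(), because newer Neo4j versions no longer accept exists() on a property. The session argument is expected to be an open session from the official neo4j Python driver, whose run() method accepts query parameters as keyword arguments.

```python
# Sketch only: the Cypher applies the CASE/FOREACH trick to the question's
# second loop. The outer FOREACH walks the reference list (or an empty list
# if it is missing); the inner FOREACH runs once if the entry has a PMID and
# not at all otherwise, so no 'NO-PMID' placeholder node is ever created.
IMPORT_QUERY = """
UNWIND $pubmed_dict AS particle
MERGE (p:Publication {pmid: particle.MedlineCitation.PMID.text})
SET p.title = coalesce(particle.MedlineCitation.Article.Journal.Title,
                       particle.MedlineCitation.Article.ArticleTitle)
FOREACH (ref IN coalesce(particle.MedlineCitation.CommentsCorrectionsList.CommentsCorrections, []) |
  FOREACH (_ IN CASE WHEN ref.PMID.text IS NOT NULL THEN [1] ELSE [] END |
    MERGE (cited_p:Publication {pmid: ref.PMID.text})
    MERGE (p)-[:REFERENCES]->(cited_p)
  )
)
"""


def import_articles(session, articles):
    """Run the import query for a list of parsed article dictionaries.

    `session` is assumed to be an open neo4j driver session; `articles`
    is the list of dictionaries produced by the XML parsing step.
    (Hypothetical helper, named here only for illustration.)
    """
    return session.run(IMPORT_QUERY, pubmed_dict=articles)
```

Because MERGE on pmid matches an existing publication or creates a new one, both cases end up with the same REFERENCES relationship from p, and entries without a PMID are skipped entirely.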