neo4j, merge, cypher, neo4j-apoc

Cypher: merge nodes with the same property and collect the other property


I have nodes with this structure:

(g:Giocatore { nome, match, nazionale})

(nome:'Del Piero', match:'45343', nazionale:'ITA')
(nome:'Messi', match:'65324', nazionale:'ARG')
(nome:'Del Piero', match:'18235', nazionale:'ITA')

The property 'match' is unique (it is the ID of a match), while several nodes share the same 'nome'. I want to merge all the nodes with the same 'nome' and collect their different 'match' values, like this:

(nome:'Del Piero', match:[45343,18235], nazionale:'ITA')
(nome:'Messi', match:'65324', nazionale:'ARG')

I tried the APOC library too, but nothing worked. Any ideas?


Solution

  • Can you try this query:

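    // Group the Giocatore nodes by nome.
    MATCH (n:Giocatore)
    WITH n.nome AS nome, collect(n) AS node2Merge
    // List comprehension instead of extract(), which was removed in Neo4j 4.0.
    WITH node2Merge, [x IN node2Merge | x.match] AS matches
    // Merge each group into a single node, then store the collected matches on it.
    CALL apoc.refactor.mergeNodes(node2Merge) YIELD node
    SET node.match = matches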
    

    Here I use APOC to merge the nodes of each group. Before merging, I map the list of nodes to a list of their match values, and after merging I set that list on the merged node.
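To make the intended result concrete, here is a minimal plain-Python model of the grouping step (no Neo4j involved). The sample data comes from the question; the function name `merge_by_nome` is hypothetical, and keeping the first node's remaining properties is an assumption of this sketch, not a statement about how `apoc.refactor.mergeNodes` resolves property conflicts.

```python
from collections import defaultdict

# Sample rows mirroring the Giocatore nodes from the question.
giocatori = [
    {"nome": "Del Piero", "match": "45343", "nazionale": "ITA"},
    {"nome": "Messi", "match": "65324", "nazionale": "ARG"},
    {"nome": "Del Piero", "match": "18235", "nazionale": "ITA"},
]


def merge_by_nome(nodes):
    """Group nodes by 'nome' and collect their 'match' values into a list."""
    groups = defaultdict(list)
    for node in nodes:
        groups[node["nome"]].append(node)  # plays the role of collect(n) per nome

    merged = []
    for group in groups.values():
        node = dict(group[0])  # assumption: keep the first node's other properties
        node["match"] = [n["match"] for n in group]  # the collected match values
        merged.append(node)
    return merged


merged = merge_by_nome(giocatori)
```

Note that in this model, as in the query, a player with a single node ends up with a one-element list for 'match' rather than a plain string.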

    I don't know how many Giocatore nodes you have. If there are a lot of them, this query may fail with an OutOfMemory error, and you will have to batch it. For example, you can replace the first line with MATCH (n:Giocatore) WHERE n.nome STARTS WITH 'A' and repeat the query for each letter, or you can use the apoc.periodic.iterate procedure:

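        CALL apoc.periodic.iterate(
           // First statement: one row per nome, with its nodes and their match values.
           'MATCH (n:Giocatore) WITH n.nome AS nome, collect(n) AS node2Merge RETURN node2Merge, [x IN node2Merge | x.match] AS matches',
           // Second statement: runs once per row and is committed in batches.
           'CALL apoc.refactor.mergeNodes(node2Merge) YIELD node
            SET node.match = matches',
           {batchSize:1000, parallel:true, retries:3, iterateList:true}
        ) YIELD batches, total
        RETURN batches, total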
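If you prefer the per-letter approach, the repetition can be scripted. The sketch below only builds the (query, parameters) pairs and does not talk to a database; `BATCH_QUERY`, `letter_batches` and the `$prefix` parameter name are hypothetical, and it assumes every nome starts with an uppercase ASCII letter. The query is written with a list comprehension to collect the match values.

```python
import string

# The merge query from the answer, restricted to names starting with $prefix.
BATCH_QUERY = """
MATCH (n:Giocatore) WHERE n.nome STARTS WITH $prefix
WITH n.nome AS nome, collect(n) AS node2Merge
WITH node2Merge, [x IN node2Merge | x.match] AS matches
CALL apoc.refactor.mergeNodes(node2Merge) YIELD node
SET node.match = matches
"""


def letter_batches(letters=string.ascii_uppercase):
    """Yield one (query, parameters) pair per initial letter."""
    for letter in letters:
        yield BATCH_QUERY, {"prefix": letter}


batches = list(letter_batches())
```

Each pair can then be passed to whatever client you use to run Cypher. Names that start with a lowercase or accented letter would need extra prefixes.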