neo4j cypher sequence frequency neo4j-apoc

Ignoring a property within a Cypher query, or alternative: how to count relationship sequences


The following Cypher statements produce the graph output shown in the image below the code, and the text output shown below that image. My problem is with the text output.

I will try to explain the problem clearly: I am interested in the same sequences of two nodes, with respect to a specific property (here: personName). E.g. as you can see in the picture (or from the create statement) Bob comes after May twice. I wanted to capture this via apoc.coll.frequencies(pairsOfActs) AS giveBackFrequencyOfPairs RETURN giveBackFrequencyOfPairs. However, the 'time' property is in the way. Is there a way to ignore the time property? I have been trying with operations on lists and also with deleting the time property (then my sequence is gone), but nothing is working. Any suggestions? Or is this approach wrong entirely, or is there even a predefined function for counting specific node sequences that I am missing?

CREATE
    (a: Action {personName: 'Tom', time: 1}), 
    (b: Action {personName: 'May', time: 2}),  
    (c: Action {personName: 'Bob', time: 3}),
    (d: Action {personName: 'Alex', time: 4}), 
    (e: Action {personName: 'Zac', time: 5}),
    (f: Action {personName: 'Jill', time: 6}),
    (g: Action {personName: 'May', time: 7}),  
    (h: Action {personName: 'Bob', time: 8})


MATCH (act: Action) 
WITH act  ORDER BY act.time ASC 
WITH COLLECT(act) AS acts 
FOREACH (n IN RANGE(0, size(acts)-2) |
FOREACH (curr IN [acts[n]] | 
FOREACH (next IN [acts[n+1]] | 
MERGE (curr)-[:NEXT]-> (next)))) 
WITH apoc.coll.pairsMin(acts) as pairsOfActs
UNWIND pairsOfActs as unwoundPairsOfActs
WITH apoc.coll.frequencies(unwoundPairsOfActs) AS    giveBackFrequencyOfPairs
RETURN giveBackFrequencyOfPairs

[Image: graph output] [Image: text output]


Solution

  • For your stated problem, there is no need to create the NEXT relationships, so this answer does not bother to create them. You can modify this answer to add that back in, if that is actually needed for some reason.

    This query should return the frequency of each pair of names (that appear in your time sequence):

    MATCH (act: Action) 
    WITH act ORDER BY act.time ASC 
    RETURN apoc.coll.frequencies(apoc.coll.pairsMin(COLLECT(act.personName))) AS giveBackFrequencyOfPairs
    

    The output, with your sample data, would be:

    ╒══════════════════════════════════════════════════════════════════════╕
    │"giveBackFrequencyOfPairs"                                            │
    ╞══════════════════════════════════════════════════════════════════════╡
    │[{"count":1,"item":["Tom","May"]},{"count":2,"item":["May","Bob"]},{"c│
    │ount":1,"item":["Bob","Alex"]},{"count":1,"item":["Alex","Zac"]},{"cou│
    │nt":1,"item":["Zac","Jill"]},{"count":1,"item":["Jill","May"]}]       │
    └──────────────────────────────────────────────────────────────────────┘
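
    If the NEXT relationships are kept after all, the same counts could probably also be read straight off the graph, without APOC. This is only a minimal sketch: it assumes the NEXT chain was created exactly as in the question (one relationship between each pair of consecutive actions), and the aliases fromName, toName and pairCount are made up for illustration.

    ```cypher
    // Assumes the NEXT chain from the question already exists.
    // Each NEXT relationship is one (current, next) pair in the time sequence.
    MATCH (curr:Action)-[:NEXT]->(nxt:Action)
    // Group by the two names only, so the time property plays no role.
    RETURN curr.personName AS fromName,
           nxt.personName  AS toName,
           count(*)        AS pairCount
    ORDER BY pairCount DESC, fromName, toName
    ```

    Unlike the APOC version, this returns one row per distinct pair instead of a single list. With the sample data, the May/Bob pair would have a pairCount of 2 and every other pair a pairCount of 1.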