Tags: neo4j, cypher, filtering, processing-efficiency, cypher-3.1

Cypher query whether a node is connected to multiple nodes in a filtered group


I am experimenting with a graph in which (:Shopper) nodes -[:Make]-> (:Purchase) nodes, and each Purchase -[:Contains]-> (:Item) nodes. I want to compare the quantity of a given Item (say, Item A) that each Shopper bought in their most recent Purchase. Eliminating Items that have only one :Contains relationship won't work, because the same Item may also have been bought in an earlier Purchase.

I can get data on the set of all Items in all Shoppers' most recent Purchases with

MATCH (s:Shopper)-->(p:Purchase)
WITH s, max(p.Time) AS latest
MATCH (s)-->(p:Purchase)
WHERE p.Time = latest
MATCH (p)-[c:Contains]->(i:Item)
RETURN s.Name, p.Time, c.Quantity, i.Name

but now I want to replace the last MATCH clause with something like

MATCH (p:Purchase)-[c1:Contains]->(i:Item)<-[c2:Contains]-(p:Purchase)

and it doesn't return any results. I suspect that this looks for items that have two :Contains relationships to the SAME Purchase. I want to get the :Contains relationships on two DIFFERENT Purchases in the same filtered group. How can I do this efficiently? I really want to avoid having to redo the filtering process on the second Purchase node.


Solution

  • [UPDATED]

    In your top query, you do not need to MATCH twice to get the latest Purchase for each Shopper (see below).

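    In your MATCH snippet, you are using the same p variable for both Purchase nodes, so both ends of the pattern are bound to the same node. Because Cypher also never matches the same relationship twice within a single pattern, the snippet can only succeed when one Purchase has two distinct :Contains relationships to the same Item, which is why it returns no results.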
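
    A minimal sketch of the corrected pattern gives each Purchase its own variable (the names p1 and p2 are illustrative, not from your query). On its own it pairs any two Purchases that share an Item, not only the latest ones, and the id() comparison is an added assumption so that each pair appears once instead of twice:

    // p1 and p2 are separate variables, so they may bind to different Purchase nodes
    MATCH (p1:Purchase)-[c1:Contains]->(i:Item)<-[c2:Contains]-(p2:Purchase)
    // keep one ordering of each pair (drops the mirrored duplicate)
    WHERE id(p1) < id(p2)
    RETURN i.Name AS item, c1.Quantity AS quantity1, c2.Quantity AS quantity2;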

    Here is a query that should return a set of data for each Item that was in the latest Purchases of multiple Shoppers:

    MATCH (s:Shopper)-[:Make]->(pur:Purchase)
    WITH s, pur
    ORDER BY pur.Time DESC
    WITH s, HEAD(COLLECT(pur)) AS p
    MATCH (p)-[c:Contains]->(i:Item)
    WITH i, COLLECT({shopper: s.Name, time: p.Time, quantity: c.Quantity}) AS set
    WHERE SIZE(set) > 1
    RETURN i.Name AS item, set;
    

    A console demonstrating the query with your sample data (with corrections to label and type names) produces this result:

    +-------------------------------------------------------------------------------------------------------------------------------+
    | item     | set                                                                                                                |
    +-------------------------------------------------------------------------------------------------------------------------------+
    | "Banana" | [{shopper=Mandy, time=213, quantity=12},{shopper=Joe, time=431, quantity=5},{shopper=Steve, time=320, quantity=1}] |
    +-------------------------------------------------------------------------------------------------------------------------------+
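
    If you would rather have one row per pair of Purchases, as in your two-sided pattern, the filtered Purchases can be collected once and reused for both ends, so the filtering is not repeated. This is only a sketch under the same assumptions as the query above (the :Make and :Contains types, and the Time, Name and Quantity properties); the names latest, p1, p2, s1 and s2 are illustrative, it assumes each Purchase was made by a single Shopper, and id() is used only to skip self-pairs and mirrored duplicates:

    // Step 1: find the latest Purchase of each Shopper (same approach as above)
    MATCH (s:Shopper)-[:Make]->(pur:Purchase)
    WITH s, pur
    ORDER BY pur.Time DESC
    WITH s, HEAD(COLLECT(pur)) AS p
    // Step 2: gather the filtered Purchases into one list, computed once
    WITH COLLECT(p) AS latest
    UNWIND latest AS p1
    // Step 3: pair each latest Purchase with another latest Purchase sharing an Item
    MATCH (s1:Shopper)-[:Make]->(p1)-[c1:Contains]->(i:Item)<-[c2:Contains]-(p2:Purchase)<-[:Make]-(s2:Shopper)
    WHERE p2 IN latest AND id(p1) < id(p2)
    RETURN i.Name AS item,
           s1.Name AS shopper1, c1.Quantity AS quantity1,
           s2.Name AS shopper2, c2.Quantity AS quantity2;

    With many Shoppers the number of pairs per Item grows quadratically, so the collected form above is usually the more compact result.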