neo4j, cypher

Removing duplicates from output of COLLECT clause


I have nodes of type Challenge, Entry, User, Comment

  • An Entry can be PART_OF a Challenge
  • A Challenge or Entry can be POSTED_BY a User
  • A Comment can be POSTED_BY a User
  • A Comment can be POSTED_IN a Challenge
  • A User LIKES a Challenge or an Entry

I'm trying to query all Entries that are PART_OF a Challenge that a specific user has liked, commented on, or posted an Entry in (one or more of these conditions).

MATCH (u:User {id: 'r1tcX0vxW'})-[:LIKES]->(c:Challenge) 
WITH COLLECT (c) as likedChallenges
MATCH (c:Challenge)<-[:POSTED_IN]-(comment:Comment)-[:POSTED_BY]->(u) 
WITH likedChallenges, COLLECT (c) as commentedChallenges 
MATCH (c:Challenge)<-[:PART_OF]-(e:Entry)-[:POSTED_BY]->(u) 
WITH likedChallenges + commentedChallenges + COLLECT (c) AS allChallenges 
UNWIND allChallenges as c 
MATCH (e:Entry)-[:PART_OF]->(c) 
RETURN e;

I'm using COLLECT to make this work, but the problem is that the output contains duplicate Challenge nodes, and I'm not sure how to remove the duplicates.


Solution

  • You can use DISTINCT in the last clause to return distinct entries:

    RETURN DISTINCT e;
    

    Note: your Cypher query also has errors. In particular, u is not carried through the WITH clauses, so the later (u) patterns are no longer bound to your user. This should work better:

    MATCH (u:User {id: 'r1tcX0vxW'})-[:LIKES]->(c:Challenge)
    WITH u, COLLECT(c) as all
    MATCH (c:Challenge)<-[:POSTED_IN]-(:Comment)-[:POSTED_BY]->(u) 
    WITH u, all + COLLECT(c) AS all
    MATCH (c:Challenge)<-[:PART_OF]-(:Entry)-[:POSTED_BY]->(u) 
    WITH all + COLLECT(c) AS all
    UNWIND all AS c
    MATCH (e:Entry)-[:PART_OF]->(c)
    RETURN DISTINCT e;
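
    A possible variation, offered as a sketch rather than a tested drop-in: a plain MATCH that finds nothing produces no rows, so the query above returns no entries unless the user has liked a challenge, commented in one, and posted an entry in one. Assuming that is not what you want, OPTIONAL MATCH keeps the row alive (COLLECT skips nulls), and WITH DISTINCT removes duplicate challenges from the combined list before the final MATCH. The names c1, c2, c3, likedChallenges, commentedChallenges and enteredChallenges are illustrative only.

    // Bind the user once so every later pattern refers to the same node.
    MATCH (u:User {id: 'r1tcX0vxW'})
    // OPTIONAL MATCH yields null instead of dropping the row when nothing matches.
    OPTIONAL MATCH (u)-[:LIKES]->(c1:Challenge)
    WITH u, COLLECT(DISTINCT c1) AS likedChallenges
    OPTIONAL MATCH (c2:Challenge)<-[:POSTED_IN]-(:Comment)-[:POSTED_BY]->(u)
    WITH u, likedChallenges, COLLECT(DISTINCT c2) AS commentedChallenges
    OPTIONAL MATCH (c3:Challenge)<-[:PART_OF]-(:Entry)-[:POSTED_BY]->(u)
    WITH likedChallenges, commentedChallenges, COLLECT(DISTINCT c3) AS enteredChallenges
    // Concatenate the three lists, then keep each challenge only once.
    UNWIND likedChallenges + commentedChallenges + enteredChallenges AS c
    WITH DISTINCT c
    MATCH (e:Entry)-[:PART_OF]->(c)
    RETURN DISTINCT e;

    The final DISTINCT is kept as a safeguard in case an Entry can be PART_OF more than one of the matched challenges.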