neo4j cypher neo4j-apoc cypher-3.1

How to calculate a distinct count for nodes


I need help with a Neo4j project. I have two node labels, Author and Article. The relationship between them is:

(author:Author)-[:WRITES]->(article:Article)

An article can be written by more than one author. I want to find the top 5 authors with the most collaborations, where each distinct co-author counts once, and return the author names together with the number of collaborations. I tried the query below, but it didn't work: it returns one row per article rather than one row per author.

MATCH (article:Article)<-[:WRITES]-(author:Author)
WITH article, collect(DISTINCT author.name) AS authors
RETURN authors, size(authors) - 1 AS numberofcollaborations
ORDER BY numberofcollaborations DESC
LIMIT 5;

Any ideas?


Solution

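  • Your query groups by article, so it counts co-authors per article rather than per author. Instead, use a path pattern that links each author to their co-authors through a shared article, and then aggregate by author: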

    // Pair each author with every co-author of a shared article
    MATCH (author:Author)-[:WRITES]->(:Article)<-[:WRITES]-(coauthor:Author)
    // Count each co-author once per author, however many articles they share
    WITH author, count(DISTINCT coauthor) AS numberofcollaborations
    ORDER BY numberofcollaborations DESC
    LIMIT 5
    RETURN author.name AS author, numberofcollaborations
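
    To check the per-author counting on a small graph, here is a minimal sketch. The author names, the article titles and the `title` property are made up for illustration; only the `Author` and `Article` labels, the `name` property and the `WRITES` relationship come from the question. It uses `count(DISTINCT coauthor)`, which should give the same number as `size(collect(DISTINCT coauthor))`.

```cypher
// Hypothetical sample data: Alice and Bob share two articles, Alice and Carol share one
CREATE (alice:Author {name: 'Alice'}),
       (bob:Author {name: 'Bob'}),
       (carol:Author {name: 'Carol'}),
       (a1:Article {title: 'A1'}),
       (a2:Article {title: 'A2'}),
       (a3:Article {title: 'A3'}),
       (alice)-[:WRITES]->(a1), (bob)-[:WRITES]->(a1),
       (alice)-[:WRITES]->(a2), (carol)-[:WRITES]->(a2),
       (alice)-[:WRITES]->(a3), (bob)-[:WRITES]->(a3);

// One row per author, counting each distinct co-author once
MATCH (author:Author)-[:WRITES]->(:Article)<-[:WRITES]-(coauthor:Author)
WITH author, count(DISTINCT coauthor) AS numberofcollaborations
ORDER BY numberofcollaborations DESC
LIMIT 5
RETURN author.name AS author, numberofcollaborations;
```

    On this data Alice should come first with 2 collaborations (Bob and Carol), even though she shares two articles with Bob. Bob and Carol each get 1, and their relative order is not defined because they tie. An author is not counted as their own co-author, because Cypher does not reuse the same relationship twice within a single MATCH pattern. Authors who have only written alone do not match the pattern at all, so they are left out of the result rather than listed with 0.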