I'm writing a program that analyses posts in a forum.
After loading the forum threads into a Neo4j database,
I'm trying to rank posts by the number of responses they received.
Responses include the direct responses as well as the entire subtree under each direct response.
The idea is to count all descendants down the tree (each thread is a simple tree without any loops).
Every post is a Neo4j node:

    # Create MSG nodes
    statement = "CREATE (c:MSG {id:{N}, title:{T}}) RETURN c"
    for msg in msgs:
        graph.cypher.execute(statement, {"N": msg[0], "T": msg[1]})

A node that represents a response to another post has a CHILD_OF relationship (r:CHILD_OF) to its parent node.
Root nodes do not have an r:CHILD_OF relationship; they have "0" as their parent ID.
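For illustration, the relationships described above could be created along these lines. This is only a sketch: `link_replies`, `LINK_STATEMENT`, and the `(parent_id, msg_id)` pair layout are hypothetical names and assumptions, and the only thing assumed about `graph` is that it offers the same `graph.cypher.execute(statement, params)` call used for the MSG nodes.

```python
# Hypothetical helper (not part of py2neo): link each reply to its parent.
# Assumes `links` is an iterable of (parent_id, msg_id) pairs and that a
# parent_id of 0 (or "0") marks a root post, as described above.
LINK_STATEMENT = (
    "MATCH (c:MSG {id:{C}}), (p:MSG {id:{P}}) "
    "CREATE (c)-[:CHILD_OF]->(p)"
)

def link_replies(graph, links):
    """Create one CHILD_OF relationship per reply; return how many were sent."""
    created = 0
    for parent_id, msg_id in links:
        if parent_id in (0, "0"):
            # Root post: there is no parent node, so no relationship is created.
            continue
        graph.cypher.execute(LINK_STATEMENT, {"C": msg_id, "P": parent_id})
        created += 1
    return created
```

Skipping the rows whose parent ID is 0 keeps root posts free of any CHILD_OF relationship, which matches the description above.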
|parent id | msg id | Rank | List of all responses
+----------+--------+------+----------------------
|0         | 1051   | 3    | (1054, 1056, 1060)
|1051      | 1054   | 0    |
|1051      | 1056   | 1    | (1060)
|1056      | 1060   | 0    |
|0         | 1052   | 0    |
I need a Cypher query that produces the ranking shown in this table,
but I'm not sure how to write it.
The project is in Python; I'm using Python 2.7, py2neo 2.0.3, and Neo4j 2.1.6.
This query should return a result set similar to your table (but without the first column):
MATCH (m:MSG)
OPTIONAL MATCH (c:MSG)-[:CHILD_OF*1..]->(m)
WITH m, COLLECT(DISTINCT c.id) AS childMsgIds
RETURN m.id AS `msg id`, LENGTH(childMsgIds) AS Rank, childMsgIds AS `List of all responses`
Does this suit your needs?
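
If it helps to sanity-check the query's output without a database, the same ranking can be computed in plain Python. This is only a sketch under stated assumptions: `rank_posts` is a hypothetical helper, and the input is assumed to be `(parent id, msg id)` pairs with 0 marking a root post, as in your table.

```python
def rank_posts(rows):
    """Map each msg id to the list of all its descendants (its responses).

    `rows` is assumed to be an iterable of (parent_id, msg_id) pairs,
    where parent_id == 0 marks a root post.
    """
    children = {}   # parent id -> list of direct child ids
    msg_ids = []
    for parent_id, msg_id in rows:
        msg_ids.append(msg_id)
        if parent_id != 0:
            children.setdefault(parent_id, []).append(msg_id)

    def descendants(msg_id):
        # Iterative pre-order walk; the data is a tree, so no cycle check.
        found = []
        stack = list(reversed(children.get(msg_id, [])))
        while stack:
            current = stack.pop()
            found.append(current)
            stack.extend(reversed(children.get(current, [])))
        return found

    return dict((msg_id, descendants(msg_id)) for msg_id in msg_ids)


rows = [(0, 1051), (1051, 1054), (1051, 1056), (1056, 1060), (0, 1052)]
for msg_id, responses in sorted(rank_posts(rows).items()):
    print("%s | %s | %s" % (msg_id, len(responses), responses))
```

The rank is simply the length of each list. The order of ids inside the `COLLECT` result is not guaranteed, so compare the lists as sets when checking them against the query.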