Neo4j: Testing PageRank 'None' results


I'm testing PageRank on a projected graph 'ns_reverse' on which I previously ran Node Similarity. My dataset initially has two types of nodes, 'Keyword' and 'Article', linked by the relationship 'APPEARS_IN', like this:

Keyword-[APPEARS_IN]->Article

After applying Node Similarity, my projected graph also has a new relationship 'SIMILAR', like this:

Article-[SIMILAR]->Article

Now that I'm testing PageRank to measure the importance of each 'Article' node, I'm getting 'None' for the nodes of type 'Keyword', but I do not want 'Keyword' nodes to be measured at all. Here is the code:

CALL gds.pageRank.stream('ns_reverse') 
YIELD nodeId, score 
RETURN gds.util.asNode(nodeId).title AS title,
       gds.util.asNode(nodeId).keyword AS keyword, score
ORDER BY score DESC, title ASC

I returned both the node titles and the node keywords to show the 'None' results that appear in both columns.

I only want to measure the importance of 'Article' nodes. What should I do?


Solution

  • If you want PageRank to be computed only for Article nodes, you need to create another projected graph that contains only these nodes and the relationship between them (presumably the new SIMILAR relationship):

    CALL gds.graph.create("pgraph_article", "Article", "SIMILAR")
    

    and then use this new projected graph when computing PageRank:

    CALL gds.pageRank.stream("pgraph_article", {})
    YIELD nodeId, score 
    RETURN gds.util.asNode(nodeId).title AS title, score
    ORDER BY score DESC, title ASC
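
An alternative sketch, assuming 'ns_reverse' already holds the 'Article' label and the 'SIMILAR' relationship type in memory (for example because Node Similarity was run in mutate mode; the question does not say, so this is an assumption): GDS algorithms accept the `nodeLabels` and `relationshipTypes` configuration keys, which restrict a computation to part of an existing projection. PageRank could then be limited to Article nodes without creating a second graph.

```cypher
// Run PageRank only on Article nodes and SIMILAR relationships
// of the existing in-memory graph 'ns_reverse'.
// Assumption: both the label and the relationship type exist in that projection.
CALL gds.pageRank.stream('ns_reverse', {
  nodeLabels: ['Article'],
  relationshipTypes: ['SIMILAR']
})
YIELD nodeId, score
RETURN gds.util.asNode(nodeId).title AS title, score
ORDER BY score DESC, title ASC
```

This variant is worth considering if the SIMILAR relationships were never written back to the database: gds.graph.create projects from the stored graph, so it can only pick up SIMILAR if those relationships exist there. Note also that in GDS 2.x the projection procedure is named gds.graph.project.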