Search code examples
webpagerankweb-search

Pagerank: node pointing to itself


If I want to compute the PageRank value for a node that is pointing to itself and a dangling node, I drop the dangling node and the initial (and final) PageRank would be 1?


Solution

  • The original Page Rank algorithm doesn't allow self-loops. However there are some variations that either explicitly add self-loops or consider those present in the link structure.

    So here we have the complete Web (or the web we crawled) containing just two nodes. A has a self loop and another link to B. B has no links. This leads to 0 values in the final PR vector. MMDS book (Ullman) suggests a couple of options: (1) Drop Dead Ends recursively, or (2) Add taxation parameters. enter image description here In your example, we can delete second node. See Fig ii. Now we are left with only one node with self loop. Remember, the deleted node has NOT got 0 score yet. If, let's say, your implementation assumes a self-loop to be counted as an inlink, now A has a PR score of 1. B has one successor (A) which would contribute to it. A has two outlinks (your looping assumption, plus link to B). See Fig iii. So finally we get B's PR as 0.5

    Note that the sums of the PageRanks exceed 1, and they no longer represent the distribution of a random surfer. Yet they do represent decent estimates of the relative importance of the pages. *Ullman MMDS, page 172-173, page 9-10 in pdf.

    Book Reference : http://infolab.stanford.edu/~ullman/mmds/ch5.pdf