Search code examples
algorithmdata-structuresdisjoint-setsunion-finddisjoint-union

Disjoint-set forests - why should the rank be increased by one when the find of two nodes are of same rank?


I am implementing the disjoint-set datastructure to do union find. I came across the following statement in Wikipedia:

... whenever two trees of the same rank r are united, the rank of the result is r+1.

Why should the rank of the joined tree be increased by only one when the trees are of the same rank? What happens if I simply add the two ranks (i.e. 2*r)?


Solution

  • Because in this case - you add one tree is a "sub tree" of the other - which makes the original subtree increase its size.

    Have a look at the following example:

    1           3
    |           |
    2           4
    

    In the above, the "rank" of each tree is 2.
    Now, let's say 1 is going to be the new unified root, you will get the following tree:

        1
       / \
      /   \
     3     2
     |
     4
    

    after the join the rank of "1" is 3, rank_old(1) + 1 - as expected.1

    As for your second question, because it will yield false height for the trees.

    If we take the above example, and merge the trees to get the tree of rank 3. What would happen if we then want to merge it with this tree2:

             9
            / \
          10   11
                |
               13
                |
               14
    

    We'll find out both ranks are 4, and try to merge them the same way we did before, without favoring the 'shorter' tree - which will result in trees with higher height, and ultimately - worse time complexity.


    (1) Disclaimer: The first part of this answer is taken from my answer to a similar question (though not identical due to your last part of the question)

    (2) Note that the above tree is syntatically made, it cannot be created in an optimized disjoint forests algorithms, but it still demonstrates the issues needed for the answer.