algorithm, data-structures, binary-tree, disjoint-sets, binomial-heap

Disjoint sets data structures and binomial trees?


Can someone explain what a disjoint-set data structure is, or alternatively link me to a YouTube video or article that explains it well?

I searched for it a few minutes ago and all I got were some math lessons that involved an image that looked like a Venn diagram. Maybe that is it, but I am not sure, so any help is appreciated.

On a quick side note, when I am asked "How to use binary tree to represent each binomial tree in a binomial queue", is this referring to the binomial trees that you have to stack upon one another? For example, a B1 attaches to another B1 to become a B2, then two B2's become a B3, and so on.


Solution

  • Disjoint set data structures are data structures for representing a partition of a set S. You begin with a set S of elements, each of which belongs to its own group. For example:

    {1}, {2}, {3}, {4}, {5}, {6}
    

    One operation on a disjoint-set data structure is the union operation, which combines together the two sets containing the given elements. For example, unioning together 1 and 2 gives back the partition

    {1, 2}, {3}, {4}, {5}, {6}
    

    Unioning together 3 and 5 produces

    {1, 2}, {3, 5}, {4}, {6}
    

    Now, unioning together 1 and 3 produces the partition

    {1, 2, 3, 5}, {4}, {6}
    

    The find operation tells you which set a given element belongs to. Typically, this is done by having find return a representative element of the set that the element belongs to, chosen such that

    find(x) == find(y)  if and only if  x and y are in the same set.
    

    For example, after the unions above, find(1) might return 2, in which case find(2), find(3), and find(5) would all return 2 as well.
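
    The union and find operations described above can be sketched in a few lines. This is only a minimal, unoptimized illustration: the class name `NaiveDisjointSets` and the helper `groups` are made up for this example, and the representative it picks is whatever root the parent pointers happen to lead to.

```python
class NaiveDisjointSets:
    """Hypothetical, unoptimized disjoint sets stored as parent pointers."""

    def __init__(self, elements):
        # Every element starts in its own set, so it is its own parent.
        self.parent = {x: x for x in elements}

    def find(self, x):
        # Follow parent pointers up to the root, which acts as the representative.
        while self.parent[x] != x:
            x = self.parent[x]
        return x

    def union(self, x, y):
        # Merge the two sets by pointing one root at the other.
        root_x, root_y = self.find(x), self.find(y)
        if root_x != root_y:
            self.parent[root_x] = root_y

    def groups(self):
        # Collect the current partition as a sorted list of sorted lists.
        by_root = {}
        for x in self.parent:
            by_root.setdefault(self.find(x), []).append(x)
        return sorted(sorted(group) for group in by_root.values())


sets = NaiveDisjointSets(range(1, 7))
sets.union(1, 2)
sets.union(3, 5)
sets.union(1, 3)
partition = sets.groups()  # [[1, 2, 3, 5], [4], [6]]
```

    In this sketch the representative of {1, 2, 3, 5} happens to be 5 rather than 2. Which element is chosen is an implementation detail; all that matters is that find gives the same answer for every element of the same set.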

    Disjoint-set data structures are often used as a subroutine in Kruskal's minimum spanning tree algorithm. They provide a very fast way of checking whether two nodes in the graph are already connected, and an easy way of recording, when an edge is added, that all the nodes in the two connected components it joins are now connected to one another. Using the disjoint-set forest implementation with union-by-rank and path compression, n operations on a disjoint-set forest can be done in O(n α(n)) time, where α(n) is the inverse Ackermann function, a function that grows so slowly that it's effectively a constant (it's at most four for any input size that could arise in practice).
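
    To make the Kruskal connection concrete, here is a rough sketch of a disjoint-set forest with union-by-rank and path compression, followed by a small Kruskal routine that uses it. The names `DisjointSetForest` and `kruskal`, the `(weight, u, v)` edge format, and the sample graph are all assumptions made for this example.

```python
class DisjointSetForest:
    """Hypothetical disjoint-set forest over the integers 0..n-1."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x):
        # First pass: walk up to the root.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass (path compression): point every node on the path at the root.
        while self.parent[x] != root:
            next_node = self.parent[x]
            self.parent[x] = root
            x = next_node
        return root

    def union(self, x, y):
        # Returns True if two different sets were merged, False otherwise.
        root_x, root_y = self.find(x), self.find(y)
        if root_x == root_y:
            return False
        # Union by rank: attach the root of lower rank under the other root.
        if self.rank[root_x] < self.rank[root_y]:
            root_x, root_y = root_y, root_x
        self.parent[root_y] = root_x
        if self.rank[root_x] == self.rank[root_y]:
            self.rank[root_x] += 1
        return True


def kruskal(num_nodes, edges):
    """edges is an iterable of (weight, u, v); returns (total_weight, chosen_edges)."""
    forest = DisjointSetForest(num_nodes)
    total, chosen = 0, []
    for weight, u, v in sorted(edges):
        # Keep the edge only if it joins two different connected components.
        if forest.union(u, v):
            total += weight
            chosen.append((u, v, weight))
    return total, chosen


edges = [(1, 0, 1), (4, 0, 2), (2, 1, 2), (7, 1, 3), (3, 2, 3)]
total, tree = kruskal(4, edges)  # total == 6, using edges 0-1, 1-2, 2-3
```

    The edges 0-2 and 1-3 are skipped because, by the time they are considered, their endpoints already have the same representative.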


    As for binomial trees and binary trees: I think what you are asking about is how to represent binomial trees, which are many-way trees, using binary trees, which have at most two children. Not all binomial trees are binary trees, so a suitable encoding must be used.

    One way to do this is using something called the left-child right-sibling representation. This represents a many-way tree as a binary tree according to the following setup:

    • The left child of each node points to the node's first child.
    • The right child of each node points to its next sibling (node in the same layer with the same parent).

    For example, given this binomial tree:

         a
       / | \
      b  c  d
     /|  |
    e f  g
      |
      h
    

    The left-child right-sibling representation would be

                     a
                    /
                   b
                /    \
               e      c
                \    / \
                 f  g   d
                /
               h   
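
    The conversion can also be written out in code. The following is a small sketch, assuming a many-way node that stores its children in a list; the class names `MultiwayNode` and `BinaryNode` and the functions `to_lcrs` and `preorder` are invented for this example. It rebuilds the tree from the diagrams above.

```python
class MultiwayNode:
    """Hypothetical many-way tree node: a label plus an ordered list of children."""

    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)


class BinaryNode:
    """Hypothetical binary node: left = first child, right = next sibling."""

    def __init__(self, label):
        self.label = label
        self.left = None
        self.right = None


def to_lcrs(node):
    # Convert a many-way tree into its left-child right-sibling form.
    binary = BinaryNode(node.label)
    previous = None
    for child in node.children:
        converted = to_lcrs(child)
        if previous is None:
            binary.left = converted      # first child hangs off the left pointer
        else:
            previous.right = converted   # later children chain along right pointers
        previous = converted
    return binary


def preorder(node):
    # Preorder traversal of the binary form, used here only to inspect the result.
    if node is None:
        return []
    return [node.label] + preorder(node.left) + preorder(node.right)


N = MultiwayNode
tree = N("a", [N("b", [N("e"), N("f", [N("h")])]), N("c", [N("g")]), N("d")])
root = to_lcrs(tree)
```

    Here `root.left` is b, `b.right` is c, and `c.right` is d, matching the second diagram, and `root.right` stays empty because the root has no siblings.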
    

    By the way - if you do this on binomial trees, you end up with a representation of a binomial tree as something called a half-ordered half-tree, which is a binary tree with the following properties:

    • Every node in the tree is less than or equal to (for a min-heap) or greater than or equal to (for a max-heap) every node in its left subtree.
    • The root node has no right child.

    These definitions follow from the fact that a binomial tree is heap-ordered and then converted into a left-child right-sibling representation. Using this representation, it is extremely fast to link together two binomial trees. I'll leave that as an exercise to the reader. :-)
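
    For anyone who wants to check their answer to that exercise, here is one possible sketch of linking, assuming min-heap order and two trees of the same order. The names `HalfTreeNode`, `link`, `size`, and `is_half_ordered` are made up for this example and are not taken from any particular library.

```python
class HalfTreeNode:
    """Hypothetical node of a binomial tree in left-child right-sibling form."""

    def __init__(self, key):
        self.key = key
        self.left = None   # first child in the binomial tree
        self.right = None  # next sibling in the binomial tree


def link(x, y):
    # Link two min-ordered binomial trees of equal order, given their roots.
    if y.key < x.key:
        x, y = y, x
    # x has the smaller key and stays the root; y becomes x's new first child,
    # and x's old first child becomes y's next sibling.
    y.right = x.left
    x.left = y
    return x


def size(node):
    return 0 if node is None else 1 + size(node.left) + size(node.right)


def is_half_ordered(node, lower=None):
    # Each node must be >= the nearest ancestor whose left subtree contains it.
    if node is None:
        return True
    if lower is not None and node.key < lower:
        return False
    return is_half_ordered(node.left, node.key) and is_half_ordered(node.right, lower)


# Build a four-node tree from single nodes by linking equal-sized trees pairwise.
pair_a = link(HalfTreeNode(5), HalfTreeNode(3))
pair_b = link(HalfTreeNode(8), HalfTreeNode(1))
root = link(pair_a, pair_b)
```

    The link itself is a comparison plus two pointer assignments, so it takes constant time regardless of how large the two trees are.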

    Hope this helps!