Search code examples
algorithmrankred-black-tree

alternative rank function RBTree (red black tree)


I have an order-statistic augmented red black tree.

it works for the most part. but i need to implement a fast function (O(lg n)) that mostly returns the place of a node in sorted order. like the OS-rank function from my textbook. but with one twist: the return value if two nodes have the same score, should be the same. here is the os-rank function (in pseudocode, for a given node x, where root is the root of the tree).

OS-Rank(x)
r=x.left.size+1
y=x
while y!=root
  if y==y.p.right
    r+=y.p.left.size+1
y=y.p
return r

But: what i need is something where if A has key 1 and Node B has key 1, the function returns 1 for both. and so on. I tried myself with something like this.

rank(x)
start with value r=1
check that x.right is not Nil
  case x.right has the same key as x
    add x.right.#nodeswithkeyhigher(x.key) to r
  other cases: add x.right.size to r
y=x
while y != root
  if y.parent.left == y
    case y.parent.right.key>x.key
      add y.parent.right to r
    other cases
      add y.parent.right.#nodeswithkeyhigher(x.key) to r
  y=y.parent
return r

Guess what: a testcase failed. I'd like to know if this is a correct way of doing things, or if perhaps i made some mistake i am not seeing (else the mistake is in the Node.#nodeswithkeyhigher(key) function).


Solution

  • edit: final paragraph for answer, thanks to Sticky.

    tl;dr: skip to last paragraphs

    This is the same issue I'm having trouble with. (Yes DS aswell). So far all runs except 5 are correct. I've tested several things, one being a very simple one: Just exchange left and right in OSRank. In some cases it gave a correct answer but in the harder cases it was quite a bit off. Oh I also added that if y.score == y.parent.score I only add the right size of y.parent, if not I add the right size + 1.

    public int OSRank(Node x)
        {
            int r = x.Right.Size + 1;
            Node y = x;
            while (y != root)
            {
                if (y == y.Parent.Left)
                {
                    if (y.Score == y.Parent.Score)
                        r = r + y.Parent.Right.Size;
                    else
                        r = r + y.Parent.Right.Size + 1;
                }
                y = y.Parent;
            }
            return r;
        }
    

    Let's first test this method on the tree on page 340 (figure 14.1). We'll search for the rank of 38 (which should return 4 because 39, 47 and 41 are higher):

    1. r = 1 + 1 = 2 //Right side + 1
    2. r = 2 //nothing happens because we're a right child
    3. r = r + 1 + 1 = 4 //we're a left child, the key of our parent is larger and parent.Right.size = 1
    4. r = 4 //nothing happens because we're a right child

    So in this case the result is correct. But what if we add another node with key 38 to our tree. That reshapes our tree a bit, the right part of node 26 now looks like:

    (I'm not allowed to add images yet so look here:http://i47.tinypic.com/358ynhh.png)

    If we would use the same algorithm we'd get the following result (picking the red one):

    1. r = 0 + 1 = 1 //no right side
    2. r = 1 //we're a right child
    3. r = 1 //we're a right child
    4. r = 1 + 3 + 1 = 5 //The 3 comes from the size of node 41.
    5. r = 5 //we're a right child

    Though we expect rank 4 here. While I was typing this out I noticed that we check if y.Score == y.Parent.Score, but I completely forgot y changes. So in line 4 the clause "y.Score == y.Parent.Score" was false because we compared node 30 with 38. So if we change that line to:

    if (x.Score == y.Parent.Score)
    

    The algorithm outputs rank 4, which is correct. This means we eliminated another issue. But there are more, which I didn't figure out either:

    • The case in which Y.Parent.Right contains duplicate keys. Technically if we have 3 nodes with the same key, they should count as 1.
    • The case in which Y.Parent.Right contains keys that are equal to x.Key (the node you want the rank of). That would put us a few ranks back, incorrectly.

    I suppose you could keep another integer which holds the amount of nodes with a higher score. Upon insertion you could climb the tree and adjust values if the subtree of that node doesn't contain a node with the same score. But how this is done (and efficiently) is unknown to me right now.

    edit: First find the final successor of x with the same score x. Then calculate the rank the normal way. The code above works.