Search code examples
duplicatesbinary-search-tree

Why do we only put duplicate nodes of a binary search tree in either the right or left subtree, but not both?


A test question I had on an exam asked why do we only put duplicate nodes of the root of a binary search tree in either the right or left subtree, but not both, so why is just putting duplicates of the root anywhere bad, what is the good thing that happens from selecting either right or left?

The question didn't give elaboration or a diagram, this was a final exam and there must've been just an implicit assumption that this was a common concept or practice. I still got a B on this exam but I missed this one and still don't understand why you can't just put any value anywhere regardless of duplicates.


Solution

  • why we only put duplicate nodes of the root of a binary search tree in either the right or left subtree?

    I suppose you mean duplicate values, which get stored in separate nodes.

    The algorithm for storing a new value in a binary search tree needs to decide whether to go left or right. The choice is unique when the value to store is different from the root's value. When it's less, it should be stored in the left subtree, when it's greater, it should be stored in the right subtree.

    However, when it is equal, it becomes a matter of choice: both options are fine. So in practice an implementation will in that case always choose left, or will always choose right -- that's just the simplest way to implement it.

    why is just putting duplicates of the root anywhere bad

    No-one says that is bad, but consider the following:

    • if insertion has to be done sometimes in the left subtree, sometimes in the right one, then you need to introduce some additional algorithm to drive that choice, e.g. a random number generation, which is just going to slow down the process.

    • if ever you need to count the number of duplicates of a given value in the tree, then it will be easier when you know they are all on one side of the root. Otherwise you need to traverse in two directions to find them all. That is also possible, but leads to be bit more code.

    • Still, if the binary search tree has additional mechanics to keep it well balanced, like for example AVL or red-black trees have, then duplicate values can end up at either side of the root through rotation. For instance, if your binary search tree is an AVL tree, and the insertion algorithm is such that equal values are always inserted in the right subtree, then see what inserting the value 6 in the following tree does:

        5               5                   5     
         \   insertion   \     rotation    / \
          5   =======>    5    =======>   5   6
                           \       
                            6 (inserted)
      

      ... and so the duplicate value is suddenly at the left side.