Search code examples
algorithmtreeb-tree2-3-4-treeinsertion-order

Sequence of insertions when generating B-Tree / 2-3-4 tree


Does anyone know about how the sequence of insertions matter for 2-3-4 Trees? Or B-Trees?

It seems the formula for minimum height is logm(k+1) where m is the max no. of children and k is the number of keys

And the formula for max height is: logn((k+1)/2) where n is the min no. of children an internal node can have.

But what sequence of insertions actually get these results?! I don't know.

It has been suggested to minimise the height of the 2-3-4 tree, you would take the median of the linear sequence eg. 1,2,3,4,5,6,7,8 it being 4, and insert that, before rinse repeating for the sub lists either side of the median. Is this true? And if so, what sequence maximises the height?


Solution

  • Yes, the sequence of insertions matters. Obviously, the tree will be taller for the same number of keys if more nodes are 1-nodes. They way to maximise the number of 1-nodes in the tree is to continually expand one branch of the tree to 4-nodes, increasing the height of the tree while leaving many nodes as 1-nodes. Essentially, insert the keys pre-sorted. 1,2,3,..,k. For a minimum height tree, you want to expand in all branches evenly so as to fill up each layer of the tree. So, you insert the median of the keys, split up the insertion list at this key and then insert the medians from the two halves of the list and so on..