algorithm, data-structures, b-tree

How to decide the order of a B-tree


B-trees are said to be particularly useful for huge amounts of data that cannot fit in main memory.

My question, then, is: how do we decide the order of a B-tree, that is, how many keys to store in a node, or how many children a node should have?

Everywhere I look, people use 4 or 5 keys per node. How does that solve the huge-data and disk-read problem?


Solution

  • Typically, you'd choose the order so that each node is as large as possible while still fitting into a single page of the underlying block device. If you're building a B-tree for an on-disk database, you'd pick the order such that each node fits into one disk page, which minimizes the number of disk reads and writes needed for each operation. If you wanted to build an in-memory B-tree, you'd likely pick the L2 or L3 cache line size as your target and try to fit as many keys as possible into a node without exceeding that size. In either case, you'd have to look up the specs of your hardware to determine what size to use.

    Of course, you could always just experiment and try to determine this empirically as well. :-)

    Hope this helps!
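
To make the "fit a node into one page" rule concrete, here is a rough sketch of the arithmetic in Python. The function names (max_btree_order, min_height), the 4096-byte page, the 8-byte keys and the 8-byte child pointers are all assumptions for illustration; a real storage engine will have its own node layout and header overhead, so treat the numbers as ballpark figures only.

```python
def max_btree_order(page_size, key_size, child_ptr_size=8,
                    value_size=0, header_size=0):
    """Largest order m (max children per node) such that a node holding
    m child pointers and m - 1 (key, value) entries fits in page_size bytes.

    All sizes are in bytes and are assumptions about a hypothetical layout.
    """
    entry_size = key_size + value_size
    usable = page_size - header_size
    # header + m * ptr + (m - 1) * entry <= page_size
    # => m <= (usable + entry) / (ptr + entry)
    order = (usable + entry_size) // (child_ptr_size + entry_size)
    if order < 3:
        raise ValueError("page is too small for a useful B-tree node")
    return order


def min_height(num_keys, order):
    """Fewest levels that can hold num_keys keys, assuming every node is
    completely full (a best case; real trees are usually less packed)."""
    height, capacity = 0, 0
    while capacity < num_keys:
        height += 1
        # A full tree with `height` levels stores order**height - 1 keys.
        capacity = order ** height - 1
    return height


if __name__ == "__main__":
    disk_order = max_btree_order(page_size=4096, key_size=8)
    print(disk_order)                             # 256
    print(min_height(1_000_000_000, disk_order))  # 4
    print(min_height(1_000_000_000, 5))           # 13
```

Under these assumed sizes, a 4 KiB page gives an order of 256, so a billion keys fit in as few as 4 levels, while an order of 5 needs at least 13. Since each level costs roughly one page read, that difference is why the 4 or 5 keys per node you see in diagrams are only for illustration and not what you'd use on disk.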