Search code examples
data-structurestreetrie

What does radix mean in a radix tree?


I was trying to understand the radix tree (or compact prefix tree) data structure.

I understand how lookup, insert and delete works with it. But I could not understand what does radix mean in a radix tree.

What is the purpose of radix here?


Solution

  • As already mentioned by @ta in the Wikipedia etymology link, the 'radix', is the the base of your trie. In this case we mean the numeric base, and we'll consider storing binary data. Radix R = 2 ^ x, where x >= 1.

    Take the example of a binary (2-ary) trie. The radix is 2, so at each node you can compare one bit. The two children will handle all of the possible outcomes:

    • the bit is 0
    • the bit is 1

    The next level of complexity would be a 4-ary trie. As @Garrett mentioned above, the radix must be a power of two so that it can always handle all possible sorting outcomes of the binary data we're using it for. A 4-ary trie can compare two binary bits with the four possible outcomes:

    • 00
    • 01
    • 10
    • 10

    These four options each lead to a different child node.

    Now, in answer to your question about the radix for the English alphabet. You want to encode letters from a to z (26 letters) so will need to have a radix of at least 2^5 = 32. This is the smallest radix that will let you switch between every letter and conform to the 'powers of two' rule. 2^4 = 16 wouldn't handle all of the letters.

    As an example, let's imagine the following encoding:

    • 00000 represents 'a',
    • 00001 represents 'b',
    • ... etc
    • 11001 represents 'z',
    • 11010 to 11111 are not in use (yet)

    we can now do a comparison of five bits at every node in the tree, so every node can now switch between any Roman-alphabet letter. If you want a trie that will handle upper case letters then you will require a larger radix. A radix of 2^6 will give you enough to do this, but it comes at the cost of more wasted space (unused branches) in the trie.

    Further reading: Sedgewick, Ch 15.4, on Multiway Tries. The 3rd edition of Algorithms by Cormen is generally excellent but doesn't do much for multiway tries.