Search code examples
javaalgorithmdata-structureshashmaptrie

Implementing a TRIE with a nested hash map?


What would be the benefit to make a TRIE with a nested hash map?

For example, let's have a nested hash map where each map has keys of just one character. So we would have something like myHashMap['d']['o']['g']['*'] = True, for the word 'Dog'. The '*' at the end means the end of an entry.

In books, I never saw this approach but rather the "classic" Node class. Why?


Solution

  • I use

    Map<Character, TrieMap<K, V>> children = new TreeMap<>();
    

    for my implementation of a TrieMap. It works very nicely.

    The benefits of using the normal Node structure is that you can wrap links to the parent map into the structure so you can iterate the map more easily. I did not take that route and build a Stack while iterating because I wanted to ensure I did not bloat the structure with unnecessary content. I then build the stack while iterating.

    The primary benefit of a Trie is it's space-saving features when the keys are similar - in my opinion it is foolish to then add unnecessary weight to the structure. Thus my decision to only use the TreeMap. The other alternative would have been an Array or List but to me neither of those are as space efficient as a TreeMap when the data is patterned well for a Trie.

    Actually - the code looks more like:

    /**
     * Map each character to a sub-trie.
     *
     * Could replace this with a 256 entry array of Tries but this will handle multi-byte character sets and I can discard
     * empty maps.
     *
     * Maintained at null until needed (for better memory footprint).
     *
     */
    private Map<Character, TrieMap<K, V>> children = null;
    
    ....
    
    /**
     * Create a new set of children.
     *
     * I've always wanted to name a method something like this.
     */
    private void makeChildren() {
      if (children == null) {
        // Use a TreeMap to ensure sorted iteration.
        children = new TreeMap<>();
      }
    }
    

    so I further reduce the memory footprint by ensuring that a childless node does not hold a wasteful empty Map (although I could just as easily have used Collections.emptyMap()).