Search code examples
algorithmdictionarytrie

Find anagram of input on set of strings..?


Given a set of strings (large set), and an input string, you need to find all the anagrams of the input string efficiently. What data structure will you use. And using that, how will you find the anagrams?

Things that I have thought of are these:

  1. Using maps

    a) eliminate all words with more/less letters than the input.

    b) put the input characters in map

    c) Traverse the map for each string and see if all letters are present with their count.

  2. Using Tries

    a) Put all strings which have the right number of characters into a trie.

    b) traverse each branch and go deeper if the letter is contained in the input.

    c) if leaf reached the word is an anagram

Can anyone find a better solution?

Are there any problems that you find in the above approaches?


Solution

  • Build a frequency-map from each word and compare these maps.

    Pseudo code:

    class Word
    
      string word
      map<char, int> frequency
    
      Word(string w)
        word = w
        for char in word
          int count = frequency.get(char)
          if count == null
            count = 0
          count++
          frequency.put(char, count)
    
      boolean is_anagram_of(that)
        return this.frequency == that.frequency