Search code examples
algorithmrecursiondynamic-programmingrecursive-backtracking

Explain the use of HashMap: Write a method to compute all permutations of a string whose characters are NOT necessarily unique


I came across the question below, don't fully understand the usage of HashMap, including the lines of map.put(c, count - 1) and map.put(c, count)?

Anyone can explain?

Permutations with Duplicates: Write a method to compute all permutations of a string whose characters are not necessarily unique. The list of permutations should not have duplicates.

public static HashMap<Character, Integer> getFreqTable(String s) {
        HashMap<Character, Integer> map = new HashMap<Character, Integer>();
        for (char c : s.toCharArray()) {
            if (!map.containsKey(c)) {
                map.put(c, 0);
            }
            map.put(c, map.get(c) + 1);
        }
        return map;
    }
    
    public static void getPerms(HashMap<Character, Integer> map, String prefix, int remaining, ArrayList<String> result) {
        if (remaining == 0) {
            result.add(prefix);
            return;
        }
        
        for (Character c : map.keySet()) {
            int count = map.get(c);
            if (count > 0) {
                map.put(c,  count - 1);
                printPerms(map, prefix + c, remaining - 1, result);
                map.put(c,  count);
            }
        }
    }
    
    public static ArrayList<String> getPerms(String s) {
        ArrayList<String> result = new ArrayList<String>();
        HashMap<Character, Integer> map = getFreqTable(s);
        getPerms(map, "", s.length(), result);
        return result;
    }
    
    public static void main(String[] args) {
        String s = "aab";
        ArrayList<String> result = getPerms(s);     
        System.out.println(result.toString());
    }

Update Thansk @trincot for his answer.

Sorry for not making it clear. I understand the use of HashMap, but I was looking for the reasoning for using it for this permutation question, particularly with duplicate numbers in the input.

For example, the reasoning why using HashMap and recursive backtracking can resolve this issue. I debugged and traced the getPerms but I cannot understand the backtracking logic naturally. The backtracking controls whether or not some permutation can be generated. But I cannot come up with it if I do it myself.

Below is the trace of first part of getPerms. X means if is not executed because a or b is zero.

aab -> aab,aba,baa
a2 b1  

"" 3   
  a:2
    a:1, 
     p(a,2)
        a:0
           p(aa,1)
           a: X aaa
           b: b=0
              p(aab,0)
                re: aab
              b=1
          a=1
        b:1
         b=0
          p(ab,1)
            a:0
              a=0
               p(aba,0)
                a:1
            b:0
             X abb
      a=2
   b:1

Update 2

below is another example that explains why using HashMap helps

without HashMap

   ab
    [aa, ab, ba, bb]
    
    ab
     a
      a b
       aa  
       bb
     b 
      b a 
       ba
       bb

with HashMap

ab   
[ab, ba] 

This tells using HashMap and backtracking avoid duplicate in the input


Solution

  • The getFreqTable will create a HashMap that has as keys the characters of the input, and as values the count of occurrence of the corresponding character. So for input "aacbac", this function returns a HashMap that can be described as follows:

    "a": 3 
    "b": 1
    "c": 2 
    

    This is a very common use of a HashMap. As it provides quick lookup of a key (of a character in this case) and quick insertion of a new key, it is the ideal solution for counting the occurrence of each character in the input.

    Then this map is used to select characters for a permutation. Whenever a character is selected for use in a permutation, its counter is decreased:

    map.put(c,  count - 1);
    

    And when backtracking from recursion (which will produce all permutations with that character c as prefix), that counter is restored again:

    map.put(c,  count);
    

    Whenever the counter for a certain character is 0, it cannot be selected anymore for a permutation. This is why there is this condition:

    if (count > 0)
    

    I hope this explains it.

    Addendum

    By maintaining the count of duplicate letters, this algorithm avoids to make an artificial distinction between these duplicate letters, by which the permutation "aa" would be considered the same as "aa", just because those two letters were swapped with eachother. A decrement of a counter does not mind where exactly that duplicate came from (position in the input). It just "takes one of them, no matter which".