Search code examples
c++stringsortingsimilarityanagram

How would you sort strings such that the anagrams are close to each other in C++?


It'd be really to implement in Java, since you could use the Comparator and the built in methods to sort character arrays and compare strings like this:

public class AnagramComparator implements Comparator<String> {
 public String sortChars(String s) {
   char[] content = s.toCharArray();
   Arrays.sort(content);
   return new String(content);
 }

public int compare(String s1, String s2) {
   return sortChars(s1).compareTo(sortChars(s2));
 }
}

But I'm wondering how would one go about implementing this in C++? Coding the C++ equivalent of the built-in methods used in the above Java code is definitely one option. Is there any other intelligent way?


Solution

  • A two-level approach:

    Sort the the string s into a sorted string t, and add an entry to a map<string, vector<string> as m[t].push_back(s);. Then each entry of the map is a vector of mutually anagrammatic strings.

    You could implement the same logic in a single, flat multimap, but that would be far more expensive (since you'd have to sort the string each time); or you could make a comparator that lexicographically compares the sorted string first and the actual string second, but again that's very expensive.