java hashmap frequency

Taking the 10 Strings with the highest values from a HashMap


I want to save all the words from the titles on a site to a file, then take the 10 most frequent words and save them to another file. Saving to the file already works, but I'm stuck on finding those 10 words: my code only finds the single most frequent word. There are surely better ways to do this than mine, so I'd be really grateful for any tips. I've gone through the most popular topics here, but all of them are about finding only the one most frequent word.

List<String> mostRepeatedWords = new ArrayList<>();
Set<Map.Entry<String, Integer>> entrySet = wordsMap.entrySet();
int max = 0;
for (int i = 0; i < entrySet.size(); i++) {
    for (Map.Entry<String, Integer> entry : entrySet) {   // here I'm looking for the word with the highest value in the map
        if (entry.getValue() > max) {
            max = entry.getValue();
        }
    }
    for (Object o : wordsMap.keySet()) {     // here I write this word to a list
        if (wordsMap.get(o).equals(max)) {
            mostRepeatedWords.add(o.toString());
        }
    }
}

Edit: Here's how I counted the words:

while (currentLine != null) {
    String[] words = currentLine.toLowerCase().split(" ");

    for (String word : words) {
        if (!wordsMap.containsKey(word) && word.length() > 3) {
            wordsMap.put(word, 1);
        } else if (word.length() > 3) {
            int value = wordsMap.get(word);
            value++;
            wordsMap.replace(word, value);
        }
    }
    currentLine = reader.readLine();
}

Solution

  • Does this do it for you?

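    First, sort the words (i.e. the keys) of the map in descending order of frequency and keep the first 10. Here mapOfWords plays the role of your wordsMap, and the snippet needs imports for java.util.Map.Entry, java.util.Comparator, java.util.List and java.util.stream.Collectors.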

    List<String> words = mapOfWords.entrySet().stream()
            .sorted(Entry.comparingByValue(Comparator.reverseOrder()))
            .limit(10)
            .map(Entry::getKey)
            .collect(Collectors.toList());
    

    Then use those keys to print the first 10 words in decreasing frequency.

    for (String word : words) {
        System.out.println(word + " " + mapOfWords.get(word));
    }
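    For reference, the two snippets above can be combined into one small runnable program. This is only a sketch: the class name TopWordsStream and the helper topN are made up for illustration, the test data is borrowed from the example further down, and the alphabetical tie-break is an added assumption (the snippet above leaves the order of equally frequent words unspecified).

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;
import java.util.stream.Collectors;

// Hypothetical wrapper class; the name is for illustration only.
class TopWordsStream {

    // Returns up to n keys ordered by descending count.
    // Ties are broken alphabetically (an assumption, added so the
    // result is deterministic).
    static List<String> topN(Map<String, Integer> counts, int n) {
        return counts.entrySet().stream()
                .sorted(Entry.<String, Integer>comparingByValue(Comparator.reverseOrder())
                        .thenComparing(Entry.<String, Integer>comparingByKey()))
                .limit(n)
                .map(Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, Integer> mapOfWords = Map.of("A", 10, "B", 3, "C", 8, "D", 9);
        // Prints "A 10", "D 9", "C 8", one per line.
        for (String word : topN(mapOfWords, 3)) {
            System.out.println(word + " " + mapOfWords.get(word));
        }
    }
}
```

    Because limit(n) simply stops early, this also works when the map holds fewer than n words.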
    

    Another more traditional approach not using streams is the following:

    Test data

    Map<String, Integer> mapOfWords =
            Map.of("A", 10, "B", 3, "C", 8, "D", 9);
    

    Create a list of map entries

    List<Entry<String, Integer>> mapEntries =
            new ArrayList<>(mapOfWords.entrySet());
    

    Define a Comparator to sort the entries by frequency

    Comparator<Entry<String, Integer>> comp = new Comparator<>() {
        @Override
        public int compare(Entry<String, Integer> e1,
                Entry<String, Integer> e2) {
            Objects.requireNonNull(e1);
            Objects.requireNonNull(e2);
            // notice e2 and e1 order is reversed to sort in descending order.
            return Integer.compare(e2.getValue(), e1.getValue());
        }
    };
    

    The above is equivalent to the following, which uses a factory method defined in the Map.Entry interface (use one definition of comp or the other, not both in the same scope):

    Comparator<Entry<String,Integer>> comp =
       Entry.comparingByValue(Comparator.reverseOrder());
    

    Now sort the list with either comparator.

    mapEntries.sort(comp);
    

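    Now just print the list of entries. If there are more than 10, either add a limiting counter or loop over mapEntries.subList(0, 10) instead. Note that subList(0, 10) throws an IndexOutOfBoundsException when the list has fewer than 10 entries, so mapEntries.subList(0, Math.min(10, mapEntries.size())) is the safer form.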

    for (Entry<?, ?> e : mapEntries) {
        System.out.println(e);
    }
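    Putting the non-stream steps together, including the subList limit, might look like the sketch below. The class name TopWordsSubList and the helper topEntries are hypothetical, and the code assumes n is not negative.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;

// Hypothetical wrapper class; the name is for illustration only.
class TopWordsSubList {

    // Returns at most n entries, sorted by descending value.
    // Assumes n >= 0; a negative n would make subList throw.
    static List<Entry<String, Integer>> topEntries(Map<String, Integer> counts, int n) {
        // Copy the entries so they can be sorted independently of the map.
        List<Entry<String, Integer>> mapEntries = new ArrayList<>(counts.entrySet());

        Comparator<Entry<String, Integer>> comp =
                Entry.comparingByValue(Comparator.reverseOrder());
        mapEntries.sort(comp);

        // Clamp the upper bound so a short list does not cause an exception.
        int end = Math.min(n, mapEntries.size());
        return mapEntries.subList(0, end);
    }

    public static void main(String[] args) {
        Map<String, Integer> mapOfWords = Map.of("A", 10, "B", 3, "C", 8, "D", 9);
        // Only 4 entries exist, so asking for 10 prints all 4.
        for (Entry<String, Integer> e : topEntries(mapOfWords, 10)) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```

    The returned list is a view of the sorted copy, not of the original map, so the map itself is never modified.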