java collections hashmap treemap

How to sort TreeMap values in descending order and how to limit the output?


This is my third day with Java (I'm a beginner coder in general), and I'm having trouble getting the output I need. I am trying to find the frequency of words occurring in a string or text file. My program works so far, except that I can't output the results from the most frequent word to the least frequent. Also, how can I limit the output to, for example, the top x most used words?

Here is my code so far:

    public static void wordOccurrence(String text) {

        String[] wordSplit = text.split(" ");

        for (int i = 0; i < wordSplit.length; i++) {
            Map<String, Integer> occurrence = new TreeMap<>(Collections.reverseOrder());
            int Counter = 0;
            for (int j = 0; j < wordSplit.length; j++) {
                if (wordSplit[i].equals(wordSplit[j])) {
                    if (j < i)
                        break;
                    Counter++;
                    occurrence.put(wordSplit[j],Counter);
                }
            }
            if (Counter > 1)
                System.out.println(occurrence);
        }
    }

And here is my output, which is unordered:

    {The=2}
    {that=2}
    {to=2}
    {and=5}
    {for=2}
    {as=2}


Solution

  • You are using TreeMap to sort your entries. TreeMap sorts entries by key, not value.

    You can use streams and LinkedHashMap for that job:

    import java.util.*;
    import java.util.Map.Entry;
    import java.util.function.Function;
    import java.util.stream.Collectors;

    public static void wordOccurrence(String text) {
        String[] wordSplit = text.split(" ");
    
        // Count how many times each word occurs.
        Map<String, Long> map = Arrays.stream(wordSplit)
            .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    
        // Sort the entries by count, highest first.
        List<Entry<String, Long>> list = new ArrayList<>(map.entrySet());
        list.sort(Entry.comparingByValue(Comparator.reverseOrder()));
    
        // Copy into a LinkedHashMap, which keeps the sorted insertion order.
        Map<String, Long> occurrence = list.stream()
            .collect(Collectors.toMap(Entry::getKey, Entry::getValue, (s1, s2) -> s1, LinkedHashMap::new));
    
        occurrence.entrySet().forEach(entry -> System.out.println(entry.getKey()+";"+entry.getValue()));
    
    }
    

    Or without using a List:

    public static void wordOccurrence(String text) {
    
        String[] wordSplit = text.split(" ");
    
        Map<String, Long> map = Arrays.stream(wordSplit)
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    
        Map<String, Long> occurrence = map.entrySet().stream()
                .sorted(Collections.reverseOrder(Map.Entry.comparingByValue()))
                .collect(Collectors.toMap(Entry::getKey, Entry::getValue, (s1, s2) -> s1, LinkedHashMap::new));
    
        occurrence.entrySet().forEach(entry -> System.out.println(entry.getKey()+";"+entry.getValue()));
            
    }
    

    If you just want the top "n" entries, add a .limit(n) step after sorting (here with n = 5):

    Map<String, Long> occurrence = map.entrySet().stream()
            .sorted(Collections.reverseOrder(Map.Entry.comparingByValue()))
            .limit(5)
            .collect(Collectors.toMap(Entry::getKey, Entry::getValue, (s1, s2) -> s1, LinkedHashMap::new));
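
The pieces above (count, sort by value, limit) can be combined into one self-contained program. This is only a sketch under a few assumptions that are not in the original answer: the class name `WordFrequency` and the helper `topWords` are hypothetical, the text is split on any whitespace rather than a single space, words are compared case-sensitively, and ties in count are broken alphabetically so the output order is deterministic.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class WordFrequency {

    // Hypothetical helper: returns the n most frequent words, most frequent first.
    static Map<String, Long> topWords(String text, int n) {
        // Count occurrences; splitting on "\\s+" is an assumption (the answer splits on " ").
        Map<String, Long> counts = Arrays.stream(text.split("\\s+"))
                .filter(word -> !word.isEmpty())
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));

        return counts.entrySet().stream()
                // Highest count first; equal counts are ordered by key (an added assumption).
                .sorted(Map.Entry.<String, Long>comparingByValue(Comparator.reverseOrder())
                        .thenComparing(Map.Entry.comparingByKey()))
                .limit(n)
                // LinkedHashMap preserves the sorted order in which entries are inserted.
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue,
                        (a, b) -> a, LinkedHashMap::new));
    }

    public static void main(String[] args) {
        // Prints "the;3" and then "and;2".
        topWords("the cat and the dog and the bird", 2)
                .forEach((word, count) -> System.out.println(word + ";" + count));
    }
}
```

Returning the map instead of printing inside the method keeps the counting logic reusable; the caller decides how to display the result.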