
Using the Java Stream API to group values by how often they occur, with the frequency as the key


I have been exposed to the pattern of using Java Stream API to group items by how often they occur in a collection via groupingBy and counting. For example,

Map<String,Long> counts = Arrays.stream(words)
        .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));

I am wondering if there is an easy way to do the inverse of this. More explicitly, I want to group items so that the key is how often they occur and the value is a collection of the items that occur that often. Essentially what this code does:

      Map<String,Long> counts = Arrays.stream(words)
        .collect(Collectors.groupingBy(
          Function.identity(),
          Collectors.counting())
      );
      
      Map<Long,List<String>> map = new HashMap<>();
      for(Map.Entry<String,Long> entry : counts.entrySet())
        map.computeIfAbsent(
          entry.getValue(),
          i -> new ArrayList<>()
        ).add(entry.getKey());

So the example input of words could be

["lorem","ipsum","lorem","lorem","dolor","dolor","sit"]

to produce an output of

{1:["ipsum","sit"],2:["dolor"],3:["lorem"]}

So far the closest I have been able to come with the Stream API is this monstrosity (there has to be a better way)

Map<Long,List<String>> map =
        Arrays.stream(words)
        .collect(
          Collectors.collectingAndThen(
            Collectors.groupingBy(
              Function.identity(),
              Collectors.counting()
            ),
           stringLongMap -> stringLongMap.entrySet().stream()
                        .collect(
                          Collectors.collectingAndThen(
                            Collectors.groupingBy(entry -> entry.getValue()),
                            longEntryMap -> longEntryMap.entrySet()
                                          .stream()
                                          .collect(
                                            Collectors.toMap(Map.Entry::getKey,
                                                             e -> e.getValue().stream()
                                                             .map(i -> i.getKey())
                                                             .collect(Collectors.toList())))))));

The above way is super roundabout and is impractical, unreadable, and otherwise terrible. I feel disgusting for even coming up with it. I was hoping there would be a way to do this that is similar to this example from the Collectors API page,

// Group employees by department
Map<Department, List<Employee>> byDept = employees.stream()
                    .collect(Collectors.groupingBy(Employee::getDepartment));

When I put Collectors.counting() inside of groupingBy, the compiler gets upset. Ultimately, the frequency is what I wish to group by. Is there a more elegant way with streams to get a Map<Long,List<String>> where the key corresponds to a frequency and the value corresponds to a collection of all items which have that frequency?

Thank you.


Solution

  • The easiest way is to count the frequency of each word, then stream the entries of that map and group them by value (the count), collecting the keys (the words) into a list.

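    // Requires: java.util.Arrays, java.util.List, java.util.Map,
    // java.util.Map.Entry, java.util.stream.Collectors
    String[] arr = { "lorem", "ipsum", "lorem", "lorem", "dolor",
            "dolor", "sit" };
    
    // Step 1: count how often each word occurs (word -> count).
    // Step 2: stream those entries and group the words by their count.
    Map<Long, List<String>> freq = Arrays.stream(arr)
            .collect(Collectors.groupingBy(str -> str, Collectors.counting()))
            .entrySet().stream()
            .collect(Collectors.groupingBy(Entry::getValue,
                    Collectors.mapping(Entry::getKey,
                            Collectors.toList())));
    
    freq.entrySet().forEach(System.out::println);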
    

    prints (groupingBy makes no guarantee about the type of map it returns, so the order may differ)

    1=[ipsum, sit]
    2=[dolor]
    3=[lorem]
    

    If you already have the counts map from your question, you only need this:

    Map<Long, List<String>> result = counts.entrySet().stream()
            .collect(Collectors.groupingBy(Entry::getValue,
                    Collectors.mapping(Entry::getKey,
                            Collectors.toList())));
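
    For reference, the two steps above can be put together as a complete program. This is only a minimal sketch: the class name FrequencyGroups and the helper groupByFrequency are hypothetical names, and the LinkedHashMap and TreeMap map factories are an addition not used in the snippets above. They are there so that the words keep their first-encounter order and the frequencies come out sorted.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

class FrequencyGroups {

    // Hypothetical helper: maps each frequency to the words that occur that often.
    static Map<Long, List<String>> groupByFrequency(String[] words) {
        // Step 1: count occurrences (word -> count).
        // LinkedHashMap is an assumption here; it keeps first-encounter order.
        Map<String, Long> counts = Arrays.stream(words)
                .collect(Collectors.groupingBy(
                        Function.identity(),
                        LinkedHashMap::new,
                        Collectors.counting()));

        // Step 2: invert the map (count -> words).
        // TreeMap is also an assumption; it keeps the frequencies sorted.
        return counts.entrySet().stream()
                .collect(Collectors.groupingBy(
                        Map.Entry::getValue,
                        TreeMap::new,
                        Collectors.mapping(Map.Entry::getKey, Collectors.toList())));
    }

    public static void main(String[] args) {
        String[] words = { "lorem", "ipsum", "lorem", "lorem", "dolor", "dolor", "sit" };
        // prints {1=[ipsum, sit], 2=[dolor], 3=[lorem]}
        System.out.println(groupByFrequency(words));
    }
}
```

    If ordering does not matter to you, the two map factory arguments can be dropped and the code reduces to the shorter form shown above.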