Java8-stream complex groupingBy


I want to use groupingBy to build a map of type Map<Integer, List<Long>>. My model is as follows:

public class Partition {
    private String tableName;
    private int vendorKey;
    private int retailerKey;
    private int periodKey;
    private int amtKey;
    private int partitionId;
}
    
public class JobQueue {

    private List<Partition> partitions;
    private int transformId;
    private long processKey;
    private String configId;
   
}

I have written the code below the old Java way and want to rewrite it in Java 8 using groupingBy:

Map<Integer, List<Long>> processKeyMap = new HashMap<>();
queues.forEach(q -> q.getPartitions().forEach(p -> {
    int key = p.getPartitionId();
    if (processKeyMap.containsKey(key)) {
        List<Long> processKeys = processKeyMap.get(key);
        processKeys.add(q.getProcessKey());
        processKeyMap.put(key, processKeys);
    } else {
        processKeyMap.put(p.getPartitionId(), new ArrayList<>(Collections.singletonList(q.getProcessKey())));
    }
}));

Thank you in advance


Solution

  • First, your current solution does unnecessary work: it puts a list back into the map when it is already there, and it builds a temporary singleton list just to copy it into a new ArrayList. You can do it like this:

    • First, if the map doesn't contain the key (partitionId), add it with an empty list.
    • Then get the list for that key (it must be there, since either you just added it or it was there before) and add the value (processKey) to it.
    Map<Integer, List<Long>> processKeyMap = new HashMap<>();
    queues.forEach(q -> q.getPartitions().forEach(p -> {
        int key = p.getPartitionId();
        if (!processKeyMap.containsKey(key)) {
            processKeyMap.put(key, new ArrayList<>());
        }
        processKeyMap.get(key).add(q.getProcessKey());
    }));
    

    Using the compute methods added to the Map interface in Java 8, the same thing can be done in a single statement. Although a streams/groupingBy solution is possible, the following is, imho, easier and more straightforward.

    • computeIfAbsent - if the key is not in the map, it adds the key with the value produced by the supplied function. It then returns either the just-added value or the one already associated with that key. In both cases that is the list to which the processKey is added.
    Map<Integer, List<Long>> processKeyMap = new HashMap<>();
    queues.forEach(q -> q.getPartitions()
            .forEach(p -> processKeyMap
                    .computeIfAbsent(p.getPartitionId(),
                            k -> new ArrayList<Long>())
                    .add(q.getProcessKey())));
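
    To make the computeIfAbsent approach easy to try out, here is a minimal, self-contained sketch. The nested Partition and JobQueue classes are trimmed-down stand-ins for the ones in the question (only the fields used here), and the class name, the group method and the sample data are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ComputeIfAbsentDemo {

    // Hypothetical, trimmed-down version of the question's Partition class.
    static class Partition {
        private final int partitionId;
        Partition(int partitionId) { this.partitionId = partitionId; }
        int getPartitionId() { return partitionId; }
    }

    // Hypothetical, trimmed-down version of the question's JobQueue class.
    static class JobQueue {
        private final long processKey;
        private final List<Partition> partitions;
        JobQueue(long processKey, Partition... partitions) {
            this.processKey = processKey;
            this.partitions = Arrays.asList(partitions);
        }
        long getProcessKey() { return processKey; }
        List<Partition> getPartitions() { return partitions; }
    }

    // Groups process keys by partition id using Map.computeIfAbsent.
    static Map<Integer, List<Long>> group(List<JobQueue> queues) {
        Map<Integer, List<Long>> processKeyMap = new HashMap<>();
        for (JobQueue q : queues) {
            for (Partition p : q.getPartitions()) {
                // Create the list on first sight of the key, then append to it.
                processKeyMap
                        .computeIfAbsent(p.getPartitionId(), k -> new ArrayList<>())
                        .add(q.getProcessKey());
            }
        }
        return processKeyMap;
    }

    public static void main(String[] args) {
        // Made-up sample data: partition 2 is shared by both queues.
        List<JobQueue> queues = Arrays.asList(
                new JobQueue(100L, new Partition(1), new Partition(2)),
                new JobQueue(200L, new Partition(2)));
        System.out.println(group(queues));
    }
}
```

    With this sample data, partition 1 maps to [100] and partition 2 maps to [100, 200]. The order of the keys when the HashMap is printed is not guaranteed.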
    

    Streams solution using groupingBy. There may be a better way to do this with streams.

    • Stream the JobQueue instances.
    • Then stream each JobQueue's Partition instances, packaging the selected info in a Map.entry(partitionId, processKey) as an interim key/value pair (Map.entry requires Java 9 or later).
    • Then group the entries by the Entry key and map each one to its Entry value, collecting the processKeys into a list.
    Map<Integer, List<Long>> processKeyMap = queues.stream()
             .flatMap(q -> q.getPartitions().stream().map(
                     p -> Map.entry(p.getPartitionId(), q.getProcessKey())))
             .collect(Collectors.groupingBy(Map.Entry::getKey, Collectors
                     .mapping(Map.Entry::getValue, Collectors.toList())));
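
    Since the question asks about Java 8 and Map.entry was only added in Java 9, here is a hedged sketch of the same pipeline that swaps in AbstractMap.SimpleEntry for the interim pair. The nested Partition and JobQueue classes are trimmed-down stand-ins for the ones in the question, and the class name, the group method and the sample data are made up for illustration.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class GroupingByJava8Demo {

    // Hypothetical, trimmed-down version of the question's Partition class.
    static class Partition {
        private final int partitionId;
        Partition(int partitionId) { this.partitionId = partitionId; }
        int getPartitionId() { return partitionId; }
    }

    // Hypothetical, trimmed-down version of the question's JobQueue class.
    static class JobQueue {
        private final long processKey;
        private final List<Partition> partitions;
        JobQueue(long processKey, Partition... partitions) {
            this.processKey = processKey;
            this.partitions = Arrays.asList(partitions);
        }
        long getProcessKey() { return processKey; }
        List<Partition> getPartitions() { return partitions; }
    }

    // Same flatMap + groupingBy pipeline, with SimpleEntry as the interim pair.
    static Map<Integer, List<Long>> group(List<JobQueue> queues) {
        return queues.stream()
                // One (partitionId, processKey) pair per partition of each queue.
                .flatMap(q -> q.getPartitions().stream()
                        .map(p -> new SimpleEntry<>(p.getPartitionId(), q.getProcessKey())))
                // Group by partitionId and collect the processKeys into a list.
                .collect(Collectors.groupingBy(
                        Map.Entry::getKey,
                        Collectors.mapping(Map.Entry::getValue, Collectors.toList())));
    }

    public static void main(String[] args) {
        // Made-up sample data: partition 2 is shared by both queues.
        List<JobQueue> queues = Arrays.asList(
                new JobQueue(100L, new Partition(1), new Partition(2)),
                new JobQueue(200L, new Partition(2)));
        System.out.println(group(queues));
    }
}
```

    On a sequential stream the process keys within each list keep the order in which the queues were encountered, so partition 2 ends up with [100, 200] here.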