Parallel computation on a Set of data in Java 8?

I have a Set of data like this:

Set<CustomObject> testSet = [{id: a1, qty: 3}, 
                             {id: a2, qty: 9},
                             {id: a3, qty: 5},
                             {id: a4, qty: 8},
                             {id: a5, qty: 12},
                             {id: a200, qty: 7}];

The ids are grouped into 3 groups which can be found using the method:

// The getGroup method is implemented in the class CustomObject.
// A Hazelcast map stores the ids that are "inclusive"; the id that
// comes in the API request is the current id.
public String getGroup(String id) {
     HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
     // The first condition was lost from the post; currentId is an assumed name for the request id.
     if (id.equals(currentId)) {
       return "currentId";
     } else if (id.equals(hazelcastInstance.getMap("idMap").get(id))) {
       return "inclusive";
     } else {
       return "exclusive";
     }
}

The real testSet is huge, and I want to sum the quantities of the objects in the Set, grouped by the method above, using Java.

I tried using streams but that doesn't allow me to use the getGroup method in the groupingBy method of Java 8 Streams.

Please guide me on how to efficiently sum the qty values based on groups with parallel processing.


  • The following code gives the sum of qty for the inclusive and exclusive groups.

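    // Assumes testSet and hazelcastInstance are in scope and CustomObject has a getQty() accessor.
    Map<String, Integer> resultMap = testSet.parallelStream()
        .collect(Collectors.groupingBy(item -> {
                if (item.getId().equals(hazelcastInstance.getMap("idMap").get(item.getId())))
                        return "inclusive";
                return "exclusive";
            }, Collectors.summingInt(CustomObject::getQty)));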
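
For a version that runs without a Hazelcast cluster, here is a minimal sketch of the same grouping that also covers the currentId group. It assumes the inclusive ids are available as a plain Set<String> (for example, a local copy of the map's keys) and that CustomObject exposes getId() and getQty(). The names GroupSumSketch, sumByGroup, currentId and inclusiveIds are made up for illustration and do not come from the question.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

public class GroupSumSketch {

    // Minimal stand-in for the question's CustomObject (fields assumed from the sample data).
    static final class CustomObject {
        private final String id;
        private final int qty;

        CustomObject(String id, int qty) {
            this.id = id;
            this.qty = qty;
        }

        String getId() { return id; }
        int getQty() { return qty; }
    }

    // Hypothetical replacement for getGroup: the inclusive ids are passed in
    // as a plain Set instead of being read from a Hazelcast map.
    static String getGroup(String id, String currentId, Set<String> inclusiveIds) {
        if (id.equals(currentId)) {
            return "currentId";
        } else if (inclusiveIds.contains(id)) {
            return "inclusive";
        } else {
            return "exclusive";
        }
    }

    // Sums qty per group in parallel; the classifier passed to groupingBy is an ordinary lambda.
    static Map<String, Integer> sumByGroup(Set<CustomObject> items,
                                           String currentId,
                                           Set<String> inclusiveIds) {
        return items.parallelStream()
                .collect(Collectors.groupingBy(
                        item -> getGroup(item.getId(), currentId, inclusiveIds),
                        Collectors.summingInt(CustomObject::getQty)));
    }

    public static void main(String[] args) {
        Set<CustomObject> testSet = new HashSet<>(Arrays.asList(
                new CustomObject("a1", 3),
                new CustomObject("a2", 9),
                new CustomObject("a3", 5),
                new CustomObject("a4", 8),
                new CustomObject("a5", 12),
                new CustomObject("a200", 7)));
        Set<String> inclusiveIds = new HashSet<>(Arrays.asList("a2", "a3"));

        System.out.println(sumByGroup(testSet, "a1", inclusiveIds));
    }
}
```

With the sample data, "a1" as the current id and a2 and a3 as the inclusive ids, the sums are 3 for currentId, 14 for inclusive and 27 for exclusive. Classifying against a local set may also help performance, since the lambda then avoids a map lookup on the cluster for every element, but that is worth measuring rather than assuming.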

    Additionally, when using parallelStream(), consider using an ArrayList instead of a HashSet as the source: an ArrayList splits into evenly sized chunks more easily, which can improve parallel performance. Measure both before committing to the change.
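
One rough way to do that measurement is sketched below. It runs the same grouped sum over a HashSet and over an ArrayList copy of it and prints the elapsed time for each. The class name SourceComparisonSketch, the group classifier (even values count as "inclusive") and the data are all invented for illustration. Single-shot timings like these are noisy, so a benchmark harness such as JMH would give more trustworthy numbers.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

public class SourceComparisonSketch {

    // Hypothetical classifier standing in for getGroup: even values are "inclusive".
    static String group(int qty) {
        return qty % 2 == 0 ? "inclusive" : "exclusive";
    }

    // The same parallel grouped sum, independent of the collection type used as the source.
    static Map<String, Long> sumByGroup(Collection<Integer> source) {
        return source.parallelStream()
                .collect(Collectors.groupingBy(
                        q -> group(q),
                        Collectors.summingLong(Integer::longValue)));
    }

    // Crude wall-clock timing of one run, in milliseconds.
    static long timeMillis(Collection<Integer> source) {
        long start = System.nanoTime();
        sumByGroup(source);
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        Set<Integer> asSet = new HashSet<>();
        for (int i = 1; i <= 1_000_000; i++) {
            asSet.add(i);
        }
        List<Integer> asList = new ArrayList<>(asSet);

        // Warm-up runs so JIT compilation does not dominate the timed runs.
        for (int i = 0; i < 5; i++) {
            sumByGroup(asSet);
            sumByGroup(asList);
        }

        System.out.println("HashSet   ms: " + timeMillis(asSet));
        System.out.println("ArrayList ms: " + timeMillis(asList));
    }
}
```

Both sources produce the same map, so swapping the collection type changes only how the work is split, not the result.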