dictionary merge java-8 collectors

How to use method reference in Java 8 for Map merge?


I have the following two ways of calling the collect operation. Both return the same result, but I still cannot rely fully on method references and need a lambda.

<R> R collect(Supplier<R> supplier,
          BiConsumer<R,? super T> accumulator,
          BiConsumer<R,R> combiner)

For this, consider the following list of 100 random numbers:

List<Double> dataList = new Random().doubles().limit(100).boxed()
            .collect(Collectors.toList());

1) The following example uses only lambdas:

Map<Boolean, Integer> partition = dataList.stream()
            .collect(() -> new ConcurrentHashMap<Boolean, Integer>(),
                     (map, x) -> {
                         map.merge(x < 0.5 ? Boolean.TRUE : Boolean.FALSE, 1, Integer::sum);
                     },
                     (map, map2) -> {
                         map2.putAll(map);
                     });

2) The following tries to use method references, but the second argument still requires a lambda:

Map<Boolean, Integer> partition2 = dataList.stream()
            .collect(ConcurrentHashMap<Boolean, Integer>::new,
                     (map, x) -> {
                         map.merge(x < 0.5 ? Boolean.TRUE : Boolean.FALSE, 1, Integer::sum);
                     },
                     Map::putAll);

How can I rewrite the second argument of the collect method in Java 8 to use a method reference instead of a lambda in this example?

System.out.println(partition.toString());
System.out.println(partition2.toString());
{false=55, true=45}
{false=55, true=45}

Solution

  • A method reference is a handy tool if you have an existing method that does exactly the intended thing. If you need adaptations or additional operations, there is no special method reference syntax to support that, unless you consider lambda expressions to be that syntax.

    Of course, you can create a new method in your class that does the desired thing and create a method reference to it. That is the right way to go when the complexity of the code rises, as the code then gets a meaningful name and becomes testable. But for simple code snippets, you can use lambda expressions, which are just a simpler syntax for the same result. Technically, there is no difference, except that the compiler-generated method holding the lambda expression body will be marked as “synthetic”.
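
    As a minimal sketch of the "named method" approach described above: the class PartitionCounter and its helpers count and mergeCounts are hypothetical names introduced only for illustration. Once such methods exist, both the accumulator and the combiner can be passed as method references.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.stream.Collectors;

// Hypothetical class name, chosen only for this illustration.
class PartitionCounter {

    // Hypothetical accumulator: increments the count stored under the key (x < 0.5).
    static void count(Map<Boolean, Integer> map, Double x) {
        map.merge(x < 0.5, 1, Integer::sum);
    }

    // Hypothetical combiner: adds every count of 'right' to the matching count in 'left'.
    static void mergeCounts(Map<Boolean, Integer> left, Map<Boolean, Integer> right) {
        right.forEach((k, v) -> left.merge(k, v, Integer::sum));
    }

    // All three arguments of collect are now method references.
    static Map<Boolean, Integer> partition(List<Double> data) {
        return data.stream()
                .collect(HashMap::new, PartitionCounter::count, PartitionCounter::mergeCounts);
    }

    public static void main(String[] args) {
        List<Double> dataList = new Random().doubles().limit(100).boxed()
                .collect(Collectors.toList());
        System.out.println(partition(dataList));
    }
}
```

    Whether this is worth it for a one-line body is a judgment call; the benefit is mainly that count and mergeCounts can be named and tested on their own.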

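    In your example, you can’t even use Map::putAll as the combiner (the third argument), as that would overwrite all existing mappings of the first map instead of merging the values. This goes unnoticed with a sequential stream, where the combiner is typically never invoked, but can produce wrong counts with a parallel stream.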
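
    The overwrite problem can be reproduced without a stream at all. The sketch below assumes two partial results of the kind a parallel collect might hand to the combiner; the class CombinerDemo, its methods, and the sample counts are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical class name, chosen only for this illustration.
class CombinerDemo {

    // Builds a partial result with the given counts (sample data, not from a real run).
    static Map<Boolean, Integer> partial(int trueCount, int falseCount) {
        Map<Boolean, Integer> m = new HashMap<>();
        m.put(true, trueCount);
        m.put(false, falseCount);
        return m;
    }

    // What Map::putAll does as a combiner: values of 'b' replace those of 'a'.
    static Map<Boolean, Integer> combineWithPutAll(Map<Boolean, Integer> a, Map<Boolean, Integer> b) {
        Map<Boolean, Integer> result = new HashMap<>(a);
        result.putAll(b);
        return result;
    }

    // What a counting combiner needs to do: add the values of 'b' to those of 'a'.
    static Map<Boolean, Integer> combineWithMerge(Map<Boolean, Integer> a, Map<Boolean, Integer> b) {
        Map<Boolean, Integer> result = new HashMap<>(a);
        b.forEach((k, v) -> result.merge(k, v, Integer::sum));
        return result;
    }

    public static void main(String[] args) {
        Map<Boolean, Integer> a = partial(25, 30);
        Map<Boolean, Integer> b = partial(20, 25);
        // putAll: the counts from 'a' are lost (true -> 20, false -> 25)
        System.out.println(combineWithPutAll(a, b));
        // merge: the counts are summed (true -> 45, false -> 55)
        System.out.println(combineWithMerge(a, b));
    }
}
```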

    A correct implementation would look like

    Map<Boolean, Integer> partition2 = dataList.stream()
        .collect(HashMap::new, 
                 (map, x) -> map.merge(x < 0.5, 1, Integer::sum),
                 (m1, m2) -> m2.forEach((k, v) -> m1.merge(k, v, Integer::sum)));
    

    However, you don’t need to implement it yourself. The Collectors class already offers appropriate built-in collectors. Note that Collectors.counting() produces Long values, so the result type changes to Map<Boolean, Long>:

    Map<Boolean, Long> partition2 = dataList.stream()
        .collect(Collectors.partitioningBy(x -> x < 0.5, Collectors.counting()));
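
    For completeness, here is a small runnable sketch of the built-in collector on fixed input, so the counts are easy to check by hand. The class PartitioningByDemo, the method countBelowHalf, and the sample values are assumptions made for illustration. One difference from the merge-based version is that the map returned by partitioningBy contains both keys, so an empty partition shows up with a count of 0 rather than being absent.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical class name, chosen only for this illustration.
class PartitioningByDemo {

    // Counts how many values are below 0.5 (key true) and how many are not (key false).
    static Map<Boolean, Long> countBelowHalf(List<Double> data) {
        return data.stream()
                .collect(Collectors.partitioningBy(x -> x < 0.5, Collectors.counting()));
    }

    public static void main(String[] args) {
        // Two values below 0.5, one value at or above it.
        System.out.println(countBelowHalf(Arrays.asList(0.1, 0.7, 0.3)));
        // No value below 0.5: the key true is still present, with count 0.
        System.out.println(countBelowHalf(Arrays.asList(0.9)));
    }
}
```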