java java-8 binary-operators

What is the Java 8 reduce BinaryOperator used for?


I am currently reading O'Reilly's Java 8 Lambdas, which is a really good book. I came across an example like this.

I have the following code:

private final BiFunction<StringBuilder, String, StringBuilder> accumulator =
    (builder, name) -> {
        if (builder.length() > 0) builder.append(",");
        builder.append("Mister:").append(name);
        return builder;
    };

final Stream<String> stringStream = Stream.of(
    "John Lennon", "Paul Mccartney", "George Harrison", "Ringo Starr");
final StringBuilder reduce = stringStream
    .filter(a -> a != null)
    .reduce(new StringBuilder(), accumulator, (left, right) -> left.append(right));
System.out.println(reduce);
System.out.println(reduce.length());

This produces the right output:

Mister:John Lennon,Mister:Paul Mccartney,Mister:George Harrison,Mister:Ringo Starr

My question is about the last parameter of the reduce method, which is a BinaryOperator.

What does this parameter do? If I change it to

.reduce(new StringBuilder(),accumulator,(left,right)->new StringBuilder());

the output is the same; if I pass null, a NullPointerException is thrown.

What is this parameter used for?

Update

Why do I get different results when I run it on a parallelStream?

First run:

returned StringBuilder length = 420

Second run:

returned StringBuilder length = 546

Third run:

returned StringBuilder length = 348

and so on. Why is this? Shouldn't it return all the values on every run?


Solution

  • The reduce method in the Stream interface is overloaded. The three-argument version takes the following parameters:

    • identity
    • accumulator
    • combiner

    The combiner supports parallel execution. It appears not to be used for sequential streams, although there is no such guarantee. If you change your stream into a parallel stream, you will likely see a difference:

    Stream<String> stringStream = Stream.of(
        "John Lennon", "Paul Mccartney", "George Harrison", "Ringo Starr")
        .parallel();
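
To observe the combiner at work, here is a minimal sketch that counts combiner invocations for a sequential and a parallel stream. The class name CombinerCallDemo, the helper run and the counter are made up for illustration, and String concatenation is used instead of the StringBuilder from the question so that the result stays deterministic. How often the combiner is called depends on how the runtime splits the stream, so no particular count is claimed here.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

class CombinerCallDemo {

    // Hypothetical counter, only used to observe how often the combiner runs.
    static final AtomicInteger combinerCalls = new AtomicInteger();

    static String run(boolean parallel) {
        combinerCalls.set(0);
        Stream<String> names = Stream.of(
            "John Lennon", "Paul Mccartney", "George Harrison", "Ringo Starr");
        if (parallel) {
            names = names.parallel();
        }
        return names.reduce(
            "",                     // identity
            String::concat,         // accumulator: adds one element to a partial result
            (left, right) -> {      // combiner: merges two partial results
                combinerCalls.incrementAndGet();
                return left.concat(right);
            });
    }

    public static void main(String[] args) {
        System.out.println(run(false) + " / combiner calls: " + combinerCalls.get());
        System.out.println(run(true) + " / combiner calls: " + combinerCalls.get());
    }
}
```

Both runs should print the same concatenated names, because the accumulator and the combiner never modify a shared object.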
    

    Here is an example of how the combiner can be used to transform a sequential reduction into a reduction that supports parallel execution. Consider a stream of four Strings, with acc as an abbreviation for accumulator.apply. The result of the reduction can then be computed as follows:

    acc(acc(acc(acc(identity, "one"), "two"), "three"), "four");
    

    With a compatible combiner, the above expression can be transformed into the following expression. Now it is possible to execute the two sub-expressions in different threads.

    combiner.apply(
        acc(acc(identity, "one"), "two"),
        acc(acc(identity, "three"), "four"));
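
The equivalence of the two expressions can be checked by hand for a side-effect-free accumulator. The sketch below assumes String concatenation as both accumulator and combiner, with "" as identity; the class name CombinerEquivalence and the helper methods are illustrative only.

```java
import java.util.function.BiFunction;
import java.util.function.BinaryOperator;

class CombinerEquivalence {

    static final String IDENTITY = "";
    static final BiFunction<String, String, String> accumulator = String::concat;
    static final BinaryOperator<String> combiner = String::concat;

    // Abbreviation for accumulator.apply, as in the text.
    static String acc(String partial, String element) {
        return accumulator.apply(partial, element);
    }

    // The purely sequential reduction.
    static String sequential() {
        return acc(acc(acc(acc(IDENTITY, "one"), "two"), "three"), "four");
    }

    // Two independent halves, merged by the combiner.
    static String split() {
        return combiner.apply(
            acc(acc(IDENTITY, "one"), "two"),
            acc(acc(IDENTITY, "three"), "four"));
    }

    public static void main(String[] args) {
        System.out.println(sequential());                  // onetwothreefour
        System.out.println(sequential().equals(split()));  // true
    }
}
```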
    

    Regarding your second question, I will use a simplified accumulator to explain the problem:

    BiFunction<StringBuilder,String,StringBuilder> accumulator =
        (builder,name) -> builder.append(name);
    

    According to the Javadoc for Stream::reduce, the accumulator has to be associative and the combiner has to be compatible with it. In this case, that would imply that the following two expressions return the same result:

    acc(acc(acc(identity, "one"), "two"), "three")
    combiner.apply(acc(identity, "one"), acc(acc(identity, "two"), "three"))
    

    That's not true for the above accumulator. The problem is that you are mutating the object referenced by identity, which is a bad idea for the reduce operation. Here are two alternative implementations that should work:
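
The effect of sharing one mutable identity can be reproduced without any threads. The following sketch is a single-threaded simulation, not what the stream library literally does: it starts two partial reductions from the same StringBuilder, using the simplified accumulator from above. The class name SharedIdentityDemo and the method simulate are invented for this illustration.

```java
import java.util.function.BiFunction;

class SharedIdentityDemo {

    static final BiFunction<StringBuilder, String, StringBuilder> accumulator =
        (builder, name) -> builder.append(name);

    static String simulate() {
        StringBuilder identity = new StringBuilder();

        // Two partial reductions that both start from the same identity object.
        StringBuilder left = accumulator.apply(accumulator.apply(identity, "one"), "two");
        StringBuilder right = accumulator.apply(accumulator.apply(identity, "three"), "four");

        // Both partial results are the very same object as identity.
        boolean sameObject = (left == identity) && (right == identity);

        // Merge the halves; left is copied first so that no builder is appended to itself.
        String combined = new StringBuilder(left).append(right).toString();
        return sameObject + ":" + combined;
    }

    public static void main(String[] args) {
        System.out.println(simulate()); // true:onetwothreefouronetwothreefour
    }
}
```

Every element ends up in the merged result twice, because both halves wrote into the same builder. In a real parallel stream several threads also append to that builder concurrently, and StringBuilder is not thread-safe, which is consistent with the length changing from run to run.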

    // identity = ""
    BiFunction<String,String,String> accumulator = String::concat;
    
    // identity = null
    BiFunction<StringBuilder,String,StringBuilder> accumulator =
        (builder,name) -> builder == null
            ? new StringBuilder(name) : builder.append(name);
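
Applied to the example from the question, the String-based approach could look like the sketch below. It is one possible variant, not the only way to do it: the class name MisterReduce, the method join and the exact combiner are assumptions made for illustration. The combiner inserts the separating comma between two non-empty partial results, so that it stays compatible with the accumulator.

```java
import java.util.Objects;
import java.util.function.BiFunction;
import java.util.function.BinaryOperator;
import java.util.stream.Stream;

class MisterReduce {

    // Builds a new String on every step instead of mutating a shared object.
    static final BiFunction<String, String, String> accumulator =
        (joined, name) -> joined.isEmpty()
            ? "Mister:" + name
            : joined + ",Mister:" + name;

    // Merges two partial results; an empty side is treated as the identity.
    static final BinaryOperator<String> combiner =
        (left, right) -> left.isEmpty() ? right
            : right.isEmpty() ? left
            : left + "," + right;

    static String join(Stream<String> names) {
        return names
            .filter(Objects::nonNull)
            .reduce("", accumulator, combiner);
    }

    public static void main(String[] args) {
        String result = join(Stream.of(
            "John Lennon", "Paul Mccartney", "George Harrison", "Ringo Starr")
            .parallel());
        // Mister:John Lennon,Mister:Paul Mccartney,Mister:George Harrison,Mister:Ringo Starr
        System.out.println(result);
    }
}
```

Because the stream is ordered and neither function has side effects, the sequential and the parallel version should produce the same String.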