java.util.stream.Collectors: Why is summingInt implemented with an array?

The standard Collector summingInt internally creates an array of length one:

public static <T> Collector<T, ?, Integer>
summingInt(ToIntFunction<? super T> mapper) {
    return new CollectorImpl<>(
            () -> new int[1],
            (a, t) -> { a[0] += mapper.applyAsInt(t); },
            (a, b) -> { a[0] += b[0]; return a; },
            a -> a[0], CH_NOID);
}

I was wondering if it isn't possible to just define:

private <T> Collector<T, Integer, Integer> summingInt(ToIntFunction<? super T> mapper) {
    return Collector.of(
            () -> 0,
            (a, t) -> a += mapper.applyAsInt(t),
            (a, b) -> a += b,
            a -> a
    );
}

This, however, doesn't work: the accumulator seems to be ignored and the result is always 0. Can anyone explain this behaviour?

Solution

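An Integer is immutable, while an int[] array is mutable. The accumulator of a Collector is a BiConsumer: it returns nothing, so the only way it can record progress is by modifying, in place, the container object it is handed.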

Imagine you have two references to two Integer objects:

Integer a = 1;
Integer b = 2;

By nature, the instances you are referring to are immutable: you can't modify them once they have been created.

Integer a = 1;  // {Integer@479}
Integer b = 2;  // {Integer@480}

You've decided to use a as an accumulator.

a += b;

The value a now holds is what you expect: 3. However, a no longer refers to the {Integer@479} it started with. The += did not change that object; it produced a different Integer and rebound the variable a to it.
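The same rebinding happens when the Integer is a method (or lambda) parameter, and there the caller never sees it. Here is a minimal sketch of the difference. The class and method names (RebindDemo, addToBoxed, addToArray) are made up for illustration:

```java
public class RebindDemo {
    // Reassigns the local parameter only; the caller's Integer is untouched.
    static void addToBoxed(Integer a, int delta) {
        a += delta; // shorthand for: a = Integer.valueOf(a.intValue() + delta);
    }

    // Mutates the array object that the caller also holds a reference to.
    static void addToArray(int[] a, int delta) {
        a[0] += delta;
    }

    public static void main(String[] args) {
        Integer boxed = 1;
        addToBoxed(boxed, 2);
        System.out.println(boxed);     // prints 1

        int[] holder = {1};
        addToArray(holder, 2);
        System.out.println(holder[0]); // prints 3
    }
}
```

The accumulator lambda in your Collector is in the same position as addToBoxed: it can reassign its own parameter, but that has no effect outside the lambda.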

I added debug statements to your Collector to make this visible.

public static <T> Collector<T, Integer, Integer> summingInt(ToIntFunction<? super T> mapper) {
  return Collector.of(
      () -> {
        Integer zero = 0;
        System.out.printf("init [%d (%d)]\n", zero, System.identityHashCode(zero));
        return zero;
      },
      (a, t) -> {
        System.out.printf("-> accumulate [%d (%d)]\n", a, System.identityHashCode(a));
        a += mapper.applyAsInt(t);
        System.out.printf("<- accumulate [%d (%d)]\n", a, System.identityHashCode(a));
      },
      (a, b) -> a += b,
      a -> a
  );
}

If you use it (the output below is consistent with a stream of the elements 1, 2, 3), you'll notice a pattern like

init [0 (6566818)]
-> accumulate [0 (6566818)]
<- accumulate [1 (1029991479)]
-> accumulate [0 (6566818)]
<- accumulate [2 (1104106489)]
-> accumulate [0 (6566818)]
<- accumulate [3 (94438417)]

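where the container 0 (6566818) never changes, despite every attempt to update it with +=. Each call to the accumulator receives the same original Integer, and the new Integer produced by += is discarded as soon as the lambda returns.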
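It may help to picture what a sequential collect does with the three functions. The following is a deliberately simplified model, not the actual JDK implementation; CollectSketch, collectSequentially and integerBasedSum are hypothetical names used only for this illustration:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Collectors;

public class CollectSketch {
    // Simplified model of a sequential collect; NOT the real JDK code.
    static <T, A, R> R collectSequentially(List<T> items, Collector<T, A, R> c) {
        A container = c.supplier().get();              // created once
        for (T item : items) {
            c.accumulator().accept(container, item);   // BiConsumer: nothing comes back
        }
        return c.finisher().apply(container);          // only in-place changes are visible
    }

    // The Integer-based collector from the question, specialised to String lengths.
    static Collector<String, Integer, Integer> integerBasedSum() {
        return Collector.of(() -> 0, (a, t) -> a += t.length(), (a, b) -> a + b, a -> a);
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "bb", "ccc");
        System.out.println(collectSequentially(words, integerBasedSum()));                  // prints 0
        System.out.println(collectSequentially(words, Collectors.summingInt(String::length))); // prints 6
    }
}
```

Because the loop keeps passing the same container reference, an accumulator that only reassigns its parameter leaves the container at its initial value.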

If you rewrite it to use an AtomicInteger (from java.util.concurrent.atomic), which is mutable,

public static <T> Collector<T, AtomicInteger, AtomicInteger> summingInt(ToIntFunction<? super T> mapper) {
  return Collector.of(
      () -> {
        AtomicInteger zero = new AtomicInteger();
        System.out.printf("init [%d (%d)]\n", zero.get(), System.identityHashCode(zero));
        return zero;
      },
      (a, t) -> {
        System.out.printf("-> accumulate [%d (%d)]\n", a.get(), System.identityHashCode(a));
        a.addAndGet(mapper.applyAsInt(t));
        System.out.printf("<- accumulate [%d (%d)]\n", a.get(), System.identityHashCode(a));
      },
      (a, b) -> { a.addAndGet(b.get()); return a; }
  );
}

you would see a true accumulator (as part of a mutable reduction) in action: the identity hash stays the same while the value grows.

init [0 (1494279232)]
-> accumulate [0 (1494279232)]
<- accumulate [1 (1494279232)]
-> accumulate [1 (1494279232)]
<- accumulate [3 (1494279232)]
-> accumulate [3 (1494279232)]
<- accumulate [6 (1494279232)]
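AtomicInteger works, but the container does not have to be thread-safe here: for a collector that is not marked CONCURRENT, each partial result gets its own container from the supplier, and the combiner merges them afterwards. A plain int[1] is therefore enough, which is what the JDK version uses. Below is a sketch of the same idea written against the public Collector.of factory; the class name ArraySumDemo is hypothetical, and the finisher unwraps the array into an Integer:

```java
import java.util.function.ToIntFunction;
import java.util.stream.Collector;
import java.util.stream.Stream;

public class ArraySumDemo {
    // Re-creation of the array-based approach using only public API.
    static <T> Collector<T, int[], Integer> summingInt(ToIntFunction<? super T> mapper) {
        return Collector.of(
                () -> new int[1],                        // mutable one-slot container
                (a, t) -> a[0] += mapper.applyAsInt(t),  // update the slot in place
                (a, b) -> { a[0] += b[0]; return a; },   // merge two partial sums
                a -> a[0]);                              // unwrap the final value
    }

    public static void main(String[] args) {
        int total = Stream.of("a", "bb", "ccc").collect(summingInt(String::length));
        System.out.println(total); // prints 6
    }
}
```

Compared with your Integer version, the only real change is that the accumulator writes into a[0] instead of reassigning a.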