In official documentation you can read that:
UNORDERED
Indicates that the collection operation does not commit to preserving the encounter order of input elements.
This is not too helpful without any examples.
My question is, what exactly does UNORDERED
characteristic mean? Should I use it with reducing collectors like min or sum or is it only applicable to collection collectors?
In OpenJDK looks like reducing operations (min, sum, avg) have empty characteristics. I expected to find there at least CONCURRENT
and UNORDERED
.
UNORDERED
essentially means that the collector is both associative (required by the spec) and commutative (not required).
Associativity allows splitting the computation into subparts and then combining them into the full result, but requires the combining step to be strictly ordered. Examine this snippet from the docs:
A a2 = supplier.get();
accumulator.accept(a2, t1);
A a3 = supplier.get();
accumulator.accept(a3, t2);
R r2 = finisher.apply(combiner.apply(a2, a3)); // result with splitting
In the last step, combiner.apply(a2, a3)
, the arguments must appear in exactly this order, which means that the entire computation pipeline must track the order and respect it in the end.
Another way of saying this is that the tree we get from recursive splitting must be ordered.
On the other hand, if the combining operation is commutative, we can combine any subpart with any other, in no particular order, and always obtain the same result. Clearly this leads to many optimization opportunities in both space and time dimensions.
It should be noted that there are UNORDERED
collectors in the JDK which don't guarantee commutativity. The main category are the "higher-order" collectors which are composed with other downstream collectors, but they don't enforce the UNORDERED
property on them.