Tags: java, java-stream, side-effects

Alternative to peek in Java streams


There are lots of questions regarding peek in the Java Streams API. I'm looking for a way to express the following common pattern using Java Streams. I can make it work with Streams, but the result is non-obvious, which makes it slightly dangerous without a comment, and that is not ideal.

boolean anyPricingComponentsChanged = false;
for (var pc : plan.getPricingComponents()) {
    if (pc.getValidTill() == null || pc.getValidTill().compareTo(dateNow) <= 0) {
        anyPricingComponentsChanged = true;
        pc.setValidTill(dateNow);
    }
}

My option:

long numberChanged = plan.getPricingComponents()
    .stream()
    .filter(pc -> pc.getValidTill() == null || pc.getValidTill().compareTo(dateNow) <= 0)
    .peek(pc -> pc.setValidTill(dateNow))
    .count(); //`count` rather than `findAny` to ensure that `peek` processes all components.

boolean anyPricingComponentsChanged = numberChanged != 0;

As an aside, whilst compareTo is not an expensive operation here and consistently returns the same result, in other cases this might not be true, and I'd rather avoid running it multiple times for this pattern.


Solution

  • // to ensure that peek processes all components

    You can't guarantee that peek() will process every stream element that should be modified. In some cases this operation can be elided from the pipeline, so you should not perform any important action via peek().

    Here's a quote from the documentation of peek():

    API Note:

    This method exists mainly to support debugging, where you want to see the elements as they flow past a certain point in a pipeline ...

    In cases where the stream implementation is able to optimize away the production of some or all the elements (such as with short-circuiting operations like findFirst, or in the example described in count()), the action will not be invoked for those elements.

    Also, here's what the Stream API documentation says regarding side-effects:

    If the behavioral parameters do have side-effects, unless explicitly stated, there are no guarantees as to:

    • the visibility of those side-effects to other threads;
    • that different operations on the "same" element within the same stream pipeline are executed in the same thread; and
    • that behavioral parameters are always invoked, since a stream implementation is free to elide operations (or entire stages) from a stream pipeline if it can prove that it would not affect the result of the computation.

    ...

    The eliding of side-effects may also be surprising. With the exception of terminal operations forEach and forEachOrdered, side-effects of behavioral parameters may not always be executed when the stream implementation can optimize away the execution of behavioral parameters without affecting the result of the computation. (For a specific example see the API note documented on the count operation.)

    Emphasis added

    Since peek is not meant to contribute to the result of the stream execution, Stream implementations are free to throw it away.
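The elision described in the quoted API notes can be observed with a small experiment. The following is only a sketch: the class and method names are made up for illustration, and the behaviour of the count() case depends on the JDK in use (the optimization was introduced in Java 9), so it should not be relied on either way.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical demo class, not part of the original question or answer.
class PeekElisionDemo {

    // Returns the elements that peek() observed before findFirst() finished.
    static List<Integer> seenBeforeFindFirst() {
        List<Integer> seen = new ArrayList<>();
        Stream.of(1, 2, 3)
            .peek(seen::add)   // side effect we would like to run for every element
            .findFirst();      // short-circuits once the first element is found
        return seen;
    }

    // Returns the elements that peek() observed before count() finished.
    static List<Integer> seenBeforeCount() {
        List<Integer> seen = new ArrayList<>();
        Stream.of(1, 2, 3)
            .peek(seen::add)
            .count();          // size is known from the source, so traversal may be skipped
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("findFirst: " + seenBeforeFindFirst()); // prints "findFirst: [1]"
        // Typically [] on JDK 9+ and [1, 2, 3] on JDK 8; neither outcome is guaranteed.
        System.out.println("count:     " + seenBeforeCount());
    }
}
```

In the question's pipeline a filter precedes count(), so the size is not known up front and current implementations do traverse the elements. That is an implementation detail rather than a documented guarantee.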

    Instead of relying on peek() you can do the following:

    List<PricingComponent> componentsToChange = plan.getPricingComponents()
        .stream()
        .filter(pc -> pc.getValidTill() == null || pc.getValidTill().compareTo(dateNow) <= 0)
        .toList();
        
    componentsToChange.forEach(pc -> pc.setValidTill(dateNow));
    
    boolean anyPricingComponentsChanged = !componentsToChange.isEmpty();
    

    If you don't want to materialize the objects that need to be modified as a List, then stick with a for-loop.
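For comparison, here is a self-contained sketch of both options: the two-step stream version and the plain loop. The PricingComponent class below is a hypothetical stand-in (the real type is not shown in the question), LocalDate is assumed for the date type, and the method names are invented for this example.

```java
import java.time.LocalDate;
import java.util.List;

// Hypothetical stand-in for the PricingComponent type from the question.
class PricingComponent {
    private LocalDate validTill;

    PricingComponent(LocalDate validTill) { this.validTill = validTill; }

    LocalDate getValidTill() { return validTill; }

    void setValidTill(LocalDate validTill) { this.validTill = validTill; }
}

class ExpireComponents {

    // Same condition as in the question: no end date, or an end date not after dateNow.
    private static boolean needsChange(PricingComponent pc, LocalDate dateNow) {
        return pc.getValidTill() == null || pc.getValidTill().compareTo(dateNow) <= 0;
    }

    // Two-step variant: select with a stream, then mutate with Iterable.forEach.
    // The predicate runs once per element, and the mutation is not a stream side effect.
    static boolean expireWithList(List<PricingComponent> components, LocalDate dateNow) {
        List<PricingComponent> toChange = components.stream()
            .filter(pc -> needsChange(pc, dateNow))
            .toList(); // Stream.toList() requires Java 16+
        toChange.forEach(pc -> pc.setValidTill(dateNow));
        return !toChange.isEmpty();
    }

    // Loop variant: no intermediate list, same observable result.
    static boolean expireWithLoop(List<PricingComponent> components, LocalDate dateNow) {
        boolean anyChanged = false;
        for (PricingComponent pc : components) {
            if (needsChange(pc, dateNow)) {
                anyChanged = true;
                pc.setValidTill(dateNow);
            }
        }
        return anyChanged;
    }

    public static void main(String[] args) {
        LocalDate now = LocalDate.of(2024, 1, 10);
        List<PricingComponent> components = List.of(
            new PricingComponent(null),
            new PricingComponent(now.plusDays(5)),
            new PricingComponent(now.minusDays(1)));
        System.out.println(expireWithList(components, now)); // prints "true"
    }
}
```

Both methods evaluate the predicate exactly once per element, which addresses the concern about repeated compareTo calls.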

    Note

    • The quotes above from the API documentation, such as "stream implementation is free to elide operations (or entire stages) from a stream pipeline if it can prove that it would not affect the result of the computation", apply to any intermediate operation with an embedded side-effect. Either the side-effect can be elided, or the whole pipeline stage (stream operation) can be optimized away if it has no impact on the result. To be on the same page regarding terminology: in short, a side-effect is anything a function does apart from producing its required result (e.g. i -> { side-effect; return i * 2; })

    • Although it's not advisable to give peek() an action that must be executed under all circumstances, at least this choice doesn't contradict the semantics of peek. By contrast, performing side-effects via filter, map, or other operations that are not designed to operate through side-effects not only fails to resolve the problem, but is also surprising, since it goes against the semantics of these operations and violates the Principle of least astonishment.
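To illustrate the first note, the sketch below puts the i -> { side-effect; return i * 2; } shape into map() and compares it with doing the same work in the terminal forEach, which the documentation quoted above names as an exception. The class and method names are hypothetical, and the number of mapper calls in the count() case depends on the JDK version.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

// Hypothetical demo class, not part of the original answer.
class SideEffectInMapDemo {

    // Side effect hidden inside map(); returns how often the mapper actually ran.
    static int mapperCallsWithCount() {
        AtomicInteger calls = new AtomicInteger();
        Stream.of(1, 2, 3)
            .map(i -> { calls.incrementAndGet(); return i * 2; })
            .count(); // map() does not change the size, so the stage may be skipped
        return calls.get();
    }

    // Same side effect performed in the terminal forEach, which is always executed.
    static int callsWithForEach() {
        AtomicInteger calls = new AtomicInteger();
        Stream.of(1, 2, 3)
            .map(i -> i * 2)
            .forEach(i -> calls.incrementAndGet());
        return calls.get();
    }

    public static void main(String[] args) {
        // Typically 0 on JDK 9+ and 3 on JDK 8; neither outcome is guaranteed.
        System.out.println("map + count: " + mapperCallsWithCount());
        System.out.println("forEach:     " + callsWithForEach()); // prints "forEach:     3"
    }
}
```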