
How to consume more than one Stream with the "same" Collector, without concatenating them?


Imagine we have a Collector, and we want to feed it the contents of a succession of Streams.

The most natural way to do this would be to concatenate the Streams and feed the concatenation to the Collector. But this might not be optimal: for example, if each Stream reads from a scarce resource allocated with a try-with-resources, it would be expensive to keep all the Streams open at once.
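To make the cost concrete, here is a minimal sketch of the concatenation approach. The `openSource` helper and the `open`/`maxOpen` counters are hypothetical stand-ins for a Stream backed by a scarce resource; they are not part of any library.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ConcatDemo {
    // Hypothetical bookkeeping: how many "resources" are open right now,
    // and the highest number that were open simultaneously.
    static final AtomicInteger open = new AtomicInteger();
    static int maxOpen = 0;

    // Hypothetical stand-in for a Stream that holds a scarce resource
    // until it is closed.
    static Stream<String> openSource(String name) {
        maxOpen = Math.max(maxOpen, open.incrementAndGet());
        return Stream.of(name + "-a", name + "-b").onClose(open::decrementAndGet);
    }

    static List<String> collectByConcat() {
        // All three Streams must exist before concat can be called,
        // so all three resources are held at the same time.
        try (Stream<String> s1 = openSource("x");
             Stream<String> s2 = openSource("y");
             Stream<String> s3 = openSource("z")) {
            return Stream.concat(Stream.concat(s1, s2), s3)
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) {
        System.out.println(collectByConcat());
        System.out.println("max open at once: " + maxOpen);
    }
}
```

In this sketch the counter peaks at the number of sources, which is the behaviour the question would like to avoid.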

Also, sometimes we might not even have direct access to the Streams: we might only have an opaque method that "feeds" a Collector that it receives as a parameter and returns the result.

How can we feed a Collector from multiple sources in those cases?


Solution

  • Instead of concatenating Streams that have each been allocated beforehand with a try-with-resources, a possible solution is to splice them into a "top-level" Stream using flatMap, and then consume the resulting Stream with the Collector.

    The usual recommendation for safe resource handling with Streams is to use try-with-resources. However, flatMap behaves specially in that respect: it ensures by itself that the spliced sub-Streams are closed, both when they are "exhausted" in the main Stream and when the pipeline terminates abruptly because of an exception.

    To my mind, the wording in the flatMap javadocs feels a bit ambiguous about cleanup in the face of exceptions:

    "Each mapped stream is closed after its contents have been placed into this stream."

    But this experiment shows that sub-Streams are closed even when an exception crops up:

    // This prints "closed!" before the stack trace
    Stream.of(1,2,3)
    .flatMap((i) ->
         Stream.<String>generate(() -> { throw new RuntimeException(); })
         .limit(2)
         .onClose(() -> System.err.println("closed!"))
    ).forEach(System.err::println);
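The flatMap approach described above can be sketched as follows. This is only an illustration under stated assumptions: `openSource`, `feed`, and the `open`/`maxOpen` counters are hypothetical names, with `openSource` standing in for whatever opens a resource-backed Stream.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collector;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class FlatMapFeedDemo {
    // Hypothetical bookkeeping: how many "resources" are open right now,
    // and the highest number that were open simultaneously.
    static final AtomicInteger open = new AtomicInteger();
    static int maxOpen = 0;

    // Hypothetical stand-in for a Stream that holds a scarce resource
    // until it is closed.
    static Stream<String> openSource(String name) {
        maxOpen = Math.max(maxOpen, open.incrementAndGet());
        return Stream.of(name + "-a", name + "-b").onClose(open::decrementAndGet);
    }

    // Feeds a single Collector from several sources. Each sub-Stream is
    // opened lazily inside flatMap, which also takes care of closing it,
    // so no try-with-resources appears here.
    static <R> R feed(List<String> names, Collector<? super String, ?, R> collector) {
        return names.stream()
                .flatMap(FlatMapFeedDemo::openSource)
                .collect(collector);
    }

    public static void main(String[] args) {
        List<String> all = feed(Arrays.asList("x", "y", "z"), Collectors.toList());
        System.out.println(all);
        System.out.println("max open at once: " + maxOpen);
    }
}
```

On a sequential pipeline like this one, each sub-Stream is opened, drained, and closed before the next is created, so the counter should never exceed 1. A parallel pipeline may hold several sub-Streams open at once.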