java, java-8, java-stream

parallelStream vs stream.parallel


I have been curious about the difference between Collection.parallelStream() and Collection.stream().parallel(). According to the Javadocs, parallelStream() returns a "possibly parallel" stream, whereas stream().parallel() returns a parallel stream. In my own testing, I have found no differences. Where does the difference between these two methods lie? Is one implementation more time-efficient than the other? Thanks.


Solution

  • Even if they behave the same at the moment, there is a difference, at least in their documentation, as you correctly pointed out. As far as I can tell, an implementation could take advantage of that in the future.

    At the moment the parallelStream method is defined in the Collection interface as:

    default Stream<E> parallelStream() {
        return StreamSupport.stream(spliterator(), true);
    }
    

    Being a default method, it can be overridden by implementations (and that is what the inner classes of java.util.Collections actually do).

    That hints that even though the default method returns a parallel Stream, there could be Collection implementations that override this method to return a non-parallel Stream. That is probably why the documentation is worded the way it is.
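
    As a rough illustration of that point, here is a minimal sketch. SequentialOnlyBag is a hypothetical collection invented for this example (it is not a JDK class); it overrides parallelStream to hand back a sequential stream, which the Collection contract permits. ArrayList is used for comparison because it inherits the default method shown above.

```java
import java.util.AbstractCollection;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.stream.Stream;

class ParallelStreamContractDemo {

    // Hypothetical collection, made up for illustration: it opts out of parallelism.
    static class SequentialOnlyBag<E> extends AbstractCollection<E> {
        private final List<E> items = new ArrayList<>();

        @Override public boolean add(E e) { return items.add(e); }
        @Override public Iterator<E> iterator() { return items.iterator(); }
        @Override public int size() { return items.size(); }

        // Allowed by the Collection contract: return a sequential stream here.
        @Override public Stream<E> parallelStream() { return stream(); }
    }

    // ArrayList does not override parallelStream, so the default method is used.
    static boolean defaultIsParallel() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3));
        return list.parallelStream().isParallel();
    }

    // The hypothetical bag returns whatever its own override decides.
    static boolean overriddenIsParallel() {
        SequentialOnlyBag<Integer> bag = new SequentialOnlyBag<>();
        bag.add(1);
        return bag.parallelStream().isParallel();
    }

    public static void main(String[] args) {
        System.out.println("default:    " + defaultIsParallel());
        System.out.println("overridden: " + overriddenIsParallel());
    }
}
```

    With the default method the stream reports itself as parallel; with the override it does not, even though the caller asked for parallelStream in both cases.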

    At the same time, even if parallelStream returns a sequential stream, it is still a Stream, so you can simply call parallel on it:

      someCollection
           .parallelStream() // may actually be sequential
           .parallel()       // force it to be parallel
    

    At least for me, this looks weird.

    It seems that the documentation should state that, after calling parallelStream, there should be no reason to call parallel again to force parallel execution, since doing so might be useless or even harmful to the processing.
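
    To see what the extra parallel call changes in practice, the sketch below checks isParallel before and after it, once on a stream from stream() and once on a stream from parallelStream(). The class and method names are made up for this example, and it only inspects the stream's mode; it says nothing about performance.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

class ForceParallelDemo {

    // Returns {before, after}: the stream's mode before and after calling parallel().
    static boolean[] modeBeforeAndAfterParallel(Stream<?> stream) {
        boolean before = stream.isParallel();
        boolean after = stream.parallel().isParallel();
        return new boolean[] { before, after };
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "b", "c");

        boolean[] fromStream = modeBeforeAndAfterParallel(words.stream());
        boolean[] fromParallelStream = modeBeforeAndAfterParallel(words.parallelStream());

        System.out.println("stream():         " + Arrays.toString(fromStream));
        System.out.println("parallelStream(): " + Arrays.toString(fromParallelStream));
    }
}
```

    For a list that uses the default methods, the first pair changes from sequential to parallel, while the second is already parallel, so the additional call has no visible effect on the mode.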

    EDIT

    For anyone reading this: please also read the comments by Holger; they cover cases beyond what I said in this answer.