I have read that streams are evaluated lazily in the pipeline: nothing is executed until a terminal operation is invoked on the expression. But what are the benefits of this lazy evaluation? Can someone please explain with some examples?
Lazy processing allows for several useful improvements at, essentially, zero cost:
For simplicity, consider some finite stream, like a list with n elements being streamed. Now consider something like:
    // Each filter() call only extends the pipeline; nothing is evaluated yet.
    Stream<SomeThing> resultStream = someList.stream();
    if (condition1) {
        resultStream = resultStream.filter(predicate1);
    }
    if (condition2) {
        resultStream = resultStream.filter(predicate2);
    }
    if (condition3) {
        resultStream = resultStream.filter(predicate3);
    }
    // The terminal operation triggers the single traversal of someList.
    List<SomeThing> result = resultStream.collect(Collectors.toList());
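Since the filters are applied lazily, this results in a single iteration through the n elements in the list, no matter how many of the conditions are enabled. Each element is pulled through the whole chain of filters once, and no intermediate list is built. Filtering the same list eagerly three times would instead make three separate passes, each up to O(n), and produce an intermediate collection after each pass. Both approaches are O(n) asymptotically, but the eager one does more work and uses more memory than the lazy evaluation does.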
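To make the single pass visible, here is a minimal sketch that logs every predicate call. The class name `LazyFilterDemo`, the `trace` helper and the two predicates are made up for illustration; only the `Stream` API calls come from the JDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class LazyFilterDemo {
    // Hypothetical helper: returns a log of the order in which things happen.
    public static List<String> trace(List<Integer> source) {
        List<String> log = new ArrayList<>();
        // Building the pipeline evaluates no predicate at all.
        Stream<Integer> pipeline = source.stream()
            .filter(x -> { log.add("even?" + x); return x % 2 == 0; })
            .filter(x -> { log.add("big?" + x); return x > 2; });
        log.add("built");
        // The terminal operation pulls each element through both filters in turn.
        List<Integer> result = pipeline.collect(Collectors.toList());
        log.add("result" + result);
        return log;
    }

    public static void main(String[] args) {
        // prints [built, even?1, even?2, big?2, even?3, even?4, big?4, result[4]]
        System.out.println(trace(Arrays.asList(1, 2, 3, 4)));
    }
}
```

The log shows "built" before any predicate runs, and the checks for each element are interleaved rather than grouped per filter, which is what a single traversal looks like.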
The source of a stream does not need to be static data such as a list. It can be a dynamic source, such as an event emitter or data read over a socket, so the stream can be prepared in advance and consumed only when needed.
For example, consider an application that attaches to a socket but does not need the incoming data right away. Perhaps the information is to be pulled from the socket only when some condition is met (5 minutes have passed, some signal is received, an error occurs, etc.). The application just needs to execute a terminal operation at that point. Until then, the stream pipeline itself consumes no CPU cycles and pulls nothing from the socket.
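As a rough illustration of preparing a stream in advance, the sketch below uses a counting `Supplier` as a stand-in for the socket. This is an assumption made to keep the example self-contained: `DeferredSourceDemo`, `prepare` and the `"msg"` payloads are hypothetical, and a real socket source would need its own reading and error handling.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class DeferredSourceDemo {
    // Hypothetical stand-in for a socket: every read bumps the counter.
    public static Stream<String> prepare(AtomicInteger reads) {
        return Stream.generate(() -> "msg" + reads.incrementAndGet())
                     .map(String::toUpperCase);
    }

    public static void main(String[] args) {
        AtomicInteger reads = new AtomicInteger();

        // The pipeline is set up ahead of time; the source is not touched yet.
        Stream<String> messages = prepare(reads);
        System.out.println("reads before terminal operation: " + reads.get()); // 0

        // Later, once the application decides it needs the data:
        List<String> firstThree = messages.limit(3).collect(Collectors.toList());
        System.out.println(firstThree);
        System.out.println("reads after terminal operation: " + reads.get());
    }
}
```

The source is read only once `collect` runs, so the decision about when to pay for the data stays with the caller.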
Eager evaluation of a stream cannot handle infinite data sources. This would require infinite memory and processing time, which, by definition, are not available. More practically, with eager evaluation one has to be extremely careful to decide beforehand how much data will be processed.
That is where lazy evaluation can be indispensable, as it handles both finite and infinite data sources in the same idiomatic way, at essentially no extra cost. One can even change a stream's source from finite to infinite with little to no change to the rest of the pipeline, provided a short-circuiting operation such as limit bounds the result. Converting an eager operation to a lazy one, by contrast, is non-trivial.
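A small sketch of that last point, with made-up names (`InfiniteSourceDemo`, `firstMultiplesOfThree`): the same pipeline method is fed first a finite list and then an infinite `Stream.iterate` source, and only the source expression changes.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class InfiniteSourceDemo {
    // Hypothetical pipeline: keep at most `count` multiples of three.
    // It does not know or care whether the source is finite.
    public static List<Integer> firstMultiplesOfThree(Stream<Integer> source, int count) {
        return source.filter(x -> x % 3 == 0)
                     .limit(count) // short-circuits, so an infinite source still terminates
                     .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Finite source: only three multiples exist in 1..10.
        Stream<Integer> finite = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10).stream();
        System.out.println(firstMultiplesOfThree(finite, 5));   // [3, 6, 9]

        // Infinite source: 1, 2, 3, ... generated on demand.
        Stream<Integer> infinite = Stream.iterate(1, x -> x + 1);
        System.out.println(firstMultiplesOfThree(infinite, 5)); // [3, 6, 9, 12, 15]
    }
}
```

Without the `limit` call, the second invocation would never finish, which is the kind of bound an eager approach would have to bake into the source itself.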