java, java-stream, java-9

Stream takeWhile for a sorted stream in the same pipeline


I'm trying to filter a sorted stream using Java 9's takeWhile to get the objects at the beginning of the stream that share the same value for a field. I can't work out the right predicate. It can be done in two steps, but that breaks the stream pipeline.

Stream.of(objA, objB, objC, objD, objE, objF, objG)
    .takeWhile(/* read objA's field value and keep taking elements
                  as long as their value for that field equals objA's */);

In two steps I could do something like

int x = Stream.of(objA, objB, objC, objD, objE, objF, objG).findFirst().get().getSomeValue();

Stream.of(objA, objB, objC, objD, objE, objF, objG).takeWhile(e -> e.getSomeValue() == x);

A simplified example could be

Stream.of(5, 5, 5, 5, 13, 14, 5, 5, 2, 5, 5, 6)
    .takeWhile(/* predicate that keeps only the first four 5s */)

Can this be done without the intermediate step that calls Optional.get()?


Solution

  • A (slightly clumsy) workaround is to use a mutable holder object that contains either null or the first value. A holder is needed because a local variable captured by a lambda must be effectively final. This example uses AtomicReference.

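    // Holds the first element seen; stays null until an element has passed through.
    AtomicReference<MyClass> holder = new AtomicReference<>();

    // Assumes MyClass.equals compares the field of interest.
    sortedStream.takeWhile(e -> holder.get() == null || holder.get().equals(e))
        .map(e -> { holder.set(e); return e; }) // remember the element just accepted
        .collect(Collectors.toList());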
    

    This still involves an extra step, but because a Stream can only be consumed once, the findFirst() approach would not work on a single stream anyway. It also keeps setting the holder even after it has been set, which is a minor annoyance rather than a problem.

    As Holger pointed out, a better and more streamlined version would be

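    // Here holder stores the field value (e.g. AtomicReference<Integer>), not the element.
    // equals avoids comparing boxed values by reference.
    sortedStream.takeWhile(e -> holder.compareAndSet(null, e.getSomeValue()) || holder.get().equals(e.getSomeValue()))
        .collect(Collectors.toList());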
    

    Here compareAndSet assigns the value for the first element and returns true; every subsequent element is compared against the holder's value.

    If you use this approach, I recommend adding a comment that points to this question/answer, or at least explains what the code does, as it may not be immediately obvious to people reading it.
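
For reference, here is a minimal runnable sketch of the compareAndSet variant, applied to the simplified integer example from the question. The class name TakeWhileFirstValue and the helpers sameKeyAsFirst and leadingRun are hypothetical names chosen for illustration, not JDK API. The sketch assumes a sequential stream and non-null field values, since the predicate is stateful.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

public class TakeWhileFirstValue {

    // Hypothetical helper (not part of the JDK): builds a stateful predicate that
    // remembers the key of the first element it sees and then accepts only
    // elements whose key equals it. Assumes sequential use and non-null keys.
    public static <T, K> Predicate<T> sameKeyAsFirst(Function<? super T, ? extends K> keyOf) {
        AtomicReference<K> first = new AtomicReference<>();
        return e -> {
            K key = keyOf.apply(e);
            // The first call stores the key and succeeds; later calls fail the
            // compareAndSet and fall through to the equals comparison.
            return first.compareAndSet(null, key) || first.get().equals(key);
        };
    }

    // Returns the leading run of values equal to the first value.
    public static List<Integer> leadingRun(List<Integer> values) {
        Predicate<Integer> sameAsFirst = sameKeyAsFirst(v -> v);
        return values.stream()
                .takeWhile(sameAsFirst)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // prints [5, 5, 5, 5]
        System.out.println(leadingRun(List.of(5, 5, 5, 5, 13, 14, 5, 5, 2, 5, 5, 6)));
    }
}
```

For objects, the same helper could be called as sameKeyAsFirst(MyClass::getSomeValue). A fresh predicate must be created for each stream, because the remembered value is never reset.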