Split list into multiple lists with fixed number of elements in Java 8


I want something similar to Scala's grouped function: basically, pick 2 elements at a time and process them. Here is a reference for the same:

Split list into multiple lists with fixed number of elements

The Stream API's Collectors do provide things like groupingBy and partitioningBy, but neither seems to do what the grouped function does in Scala. Any pointers would be appreciated.


Solution

  • This sounds like a problem that is best handled as a low-level Stream operation, just like the operations provided by the Stream API itself. A relatively simple solution may look like this:

    import java.util.*;
    import java.util.function.Consumer;
    import java.util.stream.*;

    public static <T> Stream<List<T>> chunked(Stream<T> s, int chunkSize) {
        if(chunkSize<1) throw new IllegalArgumentException("chunkSize=="+chunkSize);
        if(chunkSize==1) return s.map(Collections::singletonList);
        Spliterator<T> src=s.spliterator();
        // estimated number of chunks: ceil(size/chunkSize), unless the size is unknown
        long size=src.estimateSize();
        if(size!=Long.MAX_VALUE) size=(size+chunkSize-1)/chunkSize;
        // keep only the source characteristics that still hold for the chunks
        int ch=src.characteristics();
        ch&=Spliterator.SIZED|Spliterator.ORDERED|Spliterator.DISTINCT|Spliterator.IMMUTABLE;
        ch|=Spliterator.NONNULL; // a chunk list is never null
        return StreamSupport.stream(new Spliterators.AbstractSpliterator<List<T>>(size, ch)
        {
            private List<T> current;
            @Override
            public boolean tryAdvance(Consumer<? super List<T>> action) {
                if(current==null) current=new ArrayList<>(chunkSize);
                // pull from the source until the chunk is full or the source is exhausted
                while(current.size()<chunkSize && src.tryAdvance(current::add));
                if(!current.isEmpty()) {
                    action.accept(current); // the last chunk may be shorter
                    current=null;
                    return true;
                }
                return false;
            }
        }, s.isParallel());
    }
    

    Simple test:

    chunked(Stream.of(1, 2, 3, 4, 5, 6, 7), 3)
      .parallel().forEachOrdered(System.out::println);
    

    The advantage is that you do not need a full collection of all items for subsequent stream processing, e.g.

    System.out.println(chunked(
        IntStream.range(0, 1000).mapToObj(i -> {
            System.out.println("processing item "+i);
            return i;
        }), 2).anyMatch(list->list.toString().equals("[6, 7]")));
    

    will print:

    processing item 0
    processing item 1
    processing item 2
    processing item 3
    processing item 4
    processing item 5
    processing item 6
    processing item 7
    true
    

    rather than processing a thousand items of IntStream.range(0, 1000). This also enables using infinite source Streams:

    System.out.println(chunked(Stream.iterate(0, i->i+1), 2).anyMatch(list->list.toString().equals("[6, 7]")));
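
    To try the snippets above in one go, here is a minimal sketch that puts them into a single compilable class. The class name ChunkedDemo and the counter pulled are assumptions made for illustration; the chunked method is the one shown above, repeated only so that the class compiles on its own. The counter stands in for the "processing item" printouts.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

// Hypothetical wrapper class, only here to make the example runnable.
class ChunkedDemo {

    // The chunked(...) method from the answer, unchanged in behavior.
    static <T> Stream<List<T>> chunked(Stream<T> s, int chunkSize) {
        if (chunkSize < 1) throw new IllegalArgumentException("chunkSize==" + chunkSize);
        if (chunkSize == 1) return s.map(Collections::singletonList);
        Spliterator<T> src = s.spliterator();
        long size = src.estimateSize();
        if (size != Long.MAX_VALUE) size = (size + chunkSize - 1) / chunkSize;
        int ch = src.characteristics();
        ch &= Spliterator.SIZED | Spliterator.ORDERED | Spliterator.DISTINCT | Spliterator.IMMUTABLE;
        ch |= Spliterator.NONNULL;
        return StreamSupport.stream(new Spliterators.AbstractSpliterator<List<T>>(size, ch) {
            private List<T> current;

            @Override
            public boolean tryAdvance(Consumer<? super List<T>> action) {
                if (current == null) current = new ArrayList<>(chunkSize);
                while (current.size() < chunkSize && src.tryAdvance(current::add)) { }
                if (current.isEmpty()) return false;
                action.accept(current);
                current = null;
                return true;
            }
        }, s.isParallel());
    }

    public static void main(String[] args) {
        // The simple test, collected into a list instead of printed chunk by chunk.
        System.out.println(chunked(Stream.of(1, 2, 3, 4, 5, 6, 7), 3)
                .collect(Collectors.toList()));
        // prints [[1, 2, 3], [4, 5, 6], [7]]

        // Count how many source elements are pulled before anyMatch short-circuits.
        AtomicInteger pulled = new AtomicInteger();
        boolean found = chunked(
                IntStream.range(0, 1000).mapToObj(i -> {
                    pulled.incrementAndGet();
                    return i;
                }), 2).anyMatch(list -> list.equals(Arrays.asList(6, 7)));
        System.out.println(found + " after pulling " + pulled.get() + " source elements");
        // prints true after pulling 8 source elements
    }
}
```

    The count of 8 matches the eight "processing item" lines above: the search stops as soon as the chunk [6, 7] has been emitted.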
    

    If you are interested in a fully materialized collection rather than applying subsequent Stream operations, you may simply use the following operation:

    List<Integer> list=Arrays.asList(1, 2, 3, 4, 5, 6, 7);
    int listSize=list.size(), chunkSize=2;
    List<List<Integer>> list2=
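        IntStream.range(0, listSize==0? 0: (listSize-1)/chunkSize+1) // number of chunks; none for an empty list
                 .mapToObj(i->{
                     int from=i*chunkSize; // start index of chunk i
                     return list.subList(from, listSize-chunkSize>=from? from+chunkSize: listSize);
                 })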
                 .collect(Collectors.toList());
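
    The same index arithmetic can be wrapped in a reusable generic method. The following is only a sketch: the names ListChunks and partition are assumptions made for illustration, not part of any library. Note that subList returns views backed by the original list, not copies.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical helper class; the name is chosen for this example only.
class ListChunks {

    // Splits the list into consecutive subList views of at most chunkSize elements.
    static <T> List<List<T>> partition(List<T> list, int chunkSize) {
        if (chunkSize < 1) throw new IllegalArgumentException("chunkSize==" + chunkSize);
        int size = list.size();
        // Ceiling division, with an explicit case so an empty list yields no chunks.
        int chunkCount = size == 0 ? 0 : (size - 1) / chunkSize + 1;
        return IntStream.range(0, chunkCount)
                .mapToObj(i -> {
                    int from = i * chunkSize;
                    // Only add chunkSize when a full chunk still fits; otherwise end at size.
                    int to = size - chunkSize >= from ? from + chunkSize : size;
                    return list.subList(from, to);
                })
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(partition(Arrays.asList(1, 2, 3, 4, 5, 6, 7), 2));
        // prints [[1, 2], [3, 4], [5, 6], [7]]
    }
}
```

    If the chunks must stay valid after the source list is modified, each view can be copied, for example with new ArrayList<>(list.subList(from, to)).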