Java Streams – How to group by value and find min and max value of each group?


In my example, I have a list of Car objects and have found the min and max price for each make (using group by).

List<Car> carsDetails = UserDB.getCarsDetails();
Map<String, DoubleSummaryStatistics> collect4 = carsDetails.stream()
                .collect(Collectors.groupingBy(Car::getMake, Collectors.summarizingDouble(Car::getPrice)));
collect4.entrySet().forEach(e->System.out.println(e.getKey()+" "+e.getValue().getMax()+" "+e.getValue().getMin()));

Output (make, max price, min price):
Lexus 94837.79 17569.59
Subaru 96583.25 8498.41
Chevrolet 99892.59 6861.85

But this does not tell me which Car objects have the max and min price. How can I get them?


Solution

  • If you were interested in only one Car per group, you could use, e.g.

    Map<String, Car> mostExpensives = carsDetails.stream()
        .collect(Collectors.toMap(Car::getMake, Function.identity(),
            BinaryOperator.maxBy(Comparator.comparing(Car::getPrice))));
    mostExpensives.forEach((make,car) -> System.out.println(make+" "+car));
    

    But since you want the most expensive and the cheapest, you need something like this:

    // Each value is a two-element list: index 0 = cheapest, index 1 = most expensive
    Map<String, List<Car>> mostExpensivesAndCheapest = carsDetails.stream()
        .collect(Collectors.toMap(Car::getMake, car -> Arrays.asList(car, car),
            (l1,l2) -> Arrays.asList(
                // keep the cheaper of the two current minima
                (l1.get(0).getPrice()>l2.get(0).getPrice()? l2: l1).get(0),
                // keep the more expensive of the two current maxima
                (l1.get(1).getPrice()<l2.get(1).getPrice()? l2: l1).get(1))));
    mostExpensivesAndCheapest.forEach((make,cars) -> System.out.println(make
            +" cheapest: "+cars.get(0)+" most expensive: "+cars.get(1)));
    

    This solution is somewhat inconvenient because there is no generic statistics object equivalent to DoubleSummaryStatistics. If you need this more than once, it’s worth filling the gap with a class like this:

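    import java.util.Comparator;
    import java.util.Objects;
    import java.util.function.Consumer;
    import java.util.stream.Collector;

    /**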
     * Like {@code DoubleSummaryStatistics}, {@code IntSummaryStatistics}, and
     * {@code LongSummaryStatistics}, but for an arbitrary type {@code T}.
     */
    public class SummaryStatistics<T> implements Consumer<T> {
        /**
         * Collect to a {@code SummaryStatistics} for natural order.
         */
        public static <T extends Comparable<? super T>> Collector<T,?,SummaryStatistics<T>>
                      statistics() {
            return statistics(Comparator.<T>naturalOrder());
        }
        /**
         * Collect to a {@code SummaryStatistics} using the specified comparator.
         */
        public static <T> Collector<T,?,SummaryStatistics<T>>
                      statistics(Comparator<T> comparator) {
            Objects.requireNonNull(comparator);
            return Collector.of(() -> new SummaryStatistics<>(comparator),
                SummaryStatistics::accept, SummaryStatistics::merge);
        }
        private final Comparator<T> c;
        private T min, max;
        private long count;
        public SummaryStatistics(Comparator<T> comparator) {
            c = Objects.requireNonNull(comparator);
        }
    
        @Override
        public void accept(T t) {
            if(count == 0) {
                count = 1;
                min = t;
                max = t;
            }
            else {
                if(c.compare(min, t) > 0) min = t;
                if(c.compare(max, t) < 0) max = t;
                count++;
            }
        }
        public SummaryStatistics<T> merge(SummaryStatistics<T> s) {
            if(s.count > 0) {
                if(count == 0) {
                    count = s.count;
                    min = s.min;
                    max = s.max;
                }
                else {
                    if(c.compare(min, s.min) > 0) min = s.min;
                    if(c.compare(max, s.max) < 0) max = s.max;
                    count += s.count;
                }
            }
            return this;
        }
    
        public long getCount() {
            return count;
        }
    
        public T getMin() {
            return min;
        }
    
        public T getMax() {
            return max;
        }
    
        @Override
        public String toString() {
            return count == 0? "empty": (count+" elements between "+min+" and "+max);
        }
    }
    

    After adding this to your code base, you can use it like this:

    Map<String, SummaryStatistics<Car>> statsPerMake = carsDetails.stream()
        .collect(Collectors.groupingBy(Car::getMake,
            SummaryStatistics.statistics(Comparator.comparing(Car::getPrice))));
    // getMin() and getMax() on each value return the cheapest and most expensive Car
    statsPerMake.forEach((make,stats) -> System.out.println(make+": "+stats));
    

    If getPrice returns double, it may be more efficient to use Comparator.comparingDouble(Car::getPrice) instead of Comparator.comparing(Car::getPrice), because it compares the primitive values without boxing each price into a Double.
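
    As a minimal, self-contained sketch of that last point, the first toMap variant could be written with comparingDouble as shown here. The nested Car class is a hypothetical stand-in for the question's class (only make and a double price are modelled), and the class and method names ComparingDoubleSketch and mostExpensivePerMake are made up for illustration.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.stream.Collectors;

public class ComparingDoubleSketch {
    // Hypothetical stand-in for the question's Car class (assumption: getPrice returns double).
    static final class Car {
        private final String make;
        private final double price;
        Car(String make, double price) { this.make = make; this.price = price; }
        String getMake() { return make; }
        double getPrice() { return price; }
        @Override public String toString() { return make + "@" + price; }
    }

    // Most expensive car per make; comparingDouble compares the primitive prices without boxing.
    static Map<String, Car> mostExpensivePerMake(List<Car> cars) {
        return cars.stream()
            .collect(Collectors.toMap(Car::getMake, Function.identity(),
                BinaryOperator.maxBy(Comparator.comparingDouble(Car::getPrice))));
    }

    public static void main(String[] args) {
        // Sample data reusing the prices from the question's output.
        List<Car> cars = Arrays.asList(
            new Car("Lexus", 17569.59), new Car("Lexus", 94837.79),
            new Car("Subaru", 96583.25), new Car("Subaru", 8498.41));
        mostExpensivePerMake(cars).forEach((make, car) -> System.out.println(make + " " + car));
    }
}
```

    The same comparator can be passed to SummaryStatistics.statistics(...) unchanged, since that method only requires a Comparator<Car>.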