Group all dates that are no more than 90 days apart


I have a situation like this:

@Getter
@Setter
@AllArgsConstructor
@NoArgsConstructor
public class ObjectA {
     int id;
     LocalDate date;
     int priorityA;
     int priorityB;
}

List<ObjectA> list = List.of(
        new ObjectA(1, LocalDate.parse("2021-09-22"), 2, 1),
        new ObjectA(2, LocalDate.parse("2021-09-22"), 2, 1),
        new ObjectA(3, LocalDate.parse("2022-09-22"), 2, 1),
        new ObjectA(4, LocalDate.parse("2022-10-22"), 2, 1)
);

I need to collapse objects whose dates are no more than 90 days apart into a single object.

To choose which of two objects wins, compare them in this order:

  • the larger "priorityA" wins; if they are equal,
  • the larger "priorityB" wins; if those are also equal,
  • the earlier date wins; if the dates are also equal,
  • the smaller id wins.

My implementation is as follows:

    List<ObjectA> collapsed = list.stream()
            .collect(Collectors.collectingAndThen(
            Collectors.groupingBy(
                    ObjectA::getDate,
                    Collectors.maxBy(Comparator
                            .comparing(ObjectA::getPriorityA)
                            .thenComparing(ObjectA::getPriorityB)
                            .thenComparing(ObjectA::getDate)
                            .thenComparing(ObjectA::getId))),
            map -> map.values().stream()
                    .filter(Optional::isPresent)
                    .map(Optional::get)
                    .collect(Collectors.toList())));

But with this implementation I have two problems:

  1. I only group objects that have exactly the same date. Instead I need to group objects whose dates fall in the same range, that is, dates no more than 90 days apart.
  2. I can't combine the maxBy I need for the priorities (priorityA, priorityB) with the minBy I need for date and id.

With a correct implementation I should get something like this:

ObjectA(1, LocalDate.parse("2021-09-22"), 2, 1)
ObjectA(2, LocalDate.parse("2021-09-22"), 2, 1)
ObjectA(3, LocalDate.parse("2022-09-22"), 2, 1)
ObjectA(4, LocalDate.parse("2022-10-22"), 2, 1)
   which becomes
ObjectA(1, LocalDate.parse("2021-09-22"), 2, 1)
ObjectA(3, LocalDate.parse("2022-09-22"), 2, 1)

The first pair has the same priorities and the same date, so I take the one with the smaller id. The second pair has the same priorities but different dates, so I take the one with the earlier date.

Could someone kindly help me? I have already asked similar questions here, but I did not explain myself well enough, and the answers, although correct, did not solve my problem.

EDIT: "What if you have dates like A=2022-01-01, B=2022-03-01, C=2022-05-01? A and B are within 90 days, and B and C are within 90 days, but A and C are outside of the threshold. Do you group A and B together or B and C together, and why?"

  • In this case it is enough to collapse the first pair; once that is done, the second pair no longer matters. The ultimate goal is simply that no two remaining objects have dates within 90 days of each other.

Another example:

  • A = 1 priority (01/01/2022)
  • B = 9 priority (01/03/2022)
  • C = 8 priority (01/05/2022)

A-B: take B. B-C: take B. So from these three I keep only B.

Maybe the solution with stream, collect and Collectors is not the best for this case?

EDIT: Here is another way to solve it. How can I improve it?

// added a boolean in ObjectA -> "dropped" default false

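// daysWindow is the maximum distance in days (90 here)
for(int i = 0; i < list.size(); i++) {
            for (int j = 0; j < list.size(); j++) {
                // skip comparing an object with itself, and skip objects that were already dropped
                if(list.get(i) == list.get(j) || list.get(i).dropped || list.get(j).dropped)
                    continue;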

                long diff = ChronoUnit.DAYS.between(list.get(i).getDate(), list.get(j).getDate());
                if(diff <= daysWindow && diff >= 0 && list.get(i) != list.get(j)) {
                    if (list.get(i).getPriorityA() != list.get(j).getPriorityA()) {
                        if (list.get(i).getPriorityA() > list.get(j).getPriorityA()) {
                            list.get(j).setDropped(true);
                        } else {
                            list.get(i).setDropped(true);
                        }
                    } else if (list.get(i).getPriorityB() != list.get(j).getPriorityB()) {
                        if (list.get(i).getPriorityB() > list.get(j).getPriorityB()) {
                            list.get(j).setDropped(true);
                        } else {
                            list.get(i).setDropped(true);
                        }
                    } else if (list.get(i).getDate().compareTo(list.get(j).getDate()) != 0){
                        if (list.get(i).getDate().compareTo(list.get(j).getDate()) > 0) {
                            list.get(i).setDropped(true);
                        } else {
                            list.get(j).setDropped(true);
                        }
                    } else {
                        if (list.get(i).getId()>list.get(j).getId()) {
                            list.get(i).setDropped(true);
                        } else {
                            list.get(j).setDropped(true);
                        }
                    }
                }
            }
        }

 // then I filter the list to keep every ObjectA whose dropped flag is false.

Thank you all for the help you can give.


Solution

  • If I understand correctly, this is a two-step sorting problem. First sort the list by date so that the items can be grouped into 90-day intervals, then sort each group according to your priority rules.

    I would sort the list by date and store the date of the first object in an atomic reference. That reference is then used to form the groups: for each element, check whether its date falls within the 90-day window that starts at the reference date. If it does, the element belongs to the current group; otherwise the reference is updated and a new group is started with the new date as its key. Finally, sort the elements of each group and take the first element from each. For how to sort the groups produced by the groupingBy collector, see the post sorting-lists-after-groupingby (linked in the code below).
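
Before the full stream-based example, the grouping step described above can also be sketched as a plain loop, which avoids keeping mutable state inside a stream lambda. This is only a minimal sketch under assumptions: the class name `WindowCollapse`, the record `Item` (a stand-in for `ObjectA`, needs Java 16+) and the method `collapse` are hypothetical names, and a gap of exactly 90 days is treated as still inside the group, following the question's "no more than 90 days apart" wording.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class WindowCollapse {

    // Hypothetical stand-in for the question's ObjectA.
    record Item(int id, LocalDate date, int priorityA, int priorityB) {}

    // Orders the winner first: higher priorityA, then higher priorityB,
    // then earlier date, then lower id.
    static Comparator<Item> winnerFirst() {
        Comparator<Item> byPrioA = Comparator.comparingInt(Item::priorityA);
        Comparator<Item> byPrioB = Comparator.comparingInt(Item::priorityB);
        return byPrioA.reversed()
                .thenComparing(byPrioB.reversed())
                .thenComparing(Item::date)
                .thenComparingInt(Item::id);
    }

    // Sorts by date, opens a new group whenever an item is more than
    // windowDays after the first date of the current group, and keeps
    // the best item of each group.
    static List<Item> collapse(List<Item> items, long windowDays) {
        List<Item> sorted = new ArrayList<>(items);
        sorted.sort(Comparator.comparing(Item::date));
        Comparator<Item> winnerFirst = winnerFirst();

        List<Item> winners = new ArrayList<>();
        LocalDate anchor = null; // first date of the current group
        Item best = null;        // best item seen so far in the current group
        for (Item item : sorted) {
            boolean sameGroup = anchor != null
                    && ChronoUnit.DAYS.between(anchor, item.date()) <= windowDays;
            if (!sameGroup) {
                if (best != null) {
                    winners.add(best); // close the previous group
                }
                anchor = item.date();
                best = item;
            } else if (winnerFirst.compare(item, best) < 0) {
                best = item;
            }
        }
        if (best != null) {
            winners.add(best); // close the last group
        }
        return winners;
    }

    public static void main(String[] args) {
        List<Item> list = List.of(
                new Item(1, LocalDate.parse("2021-09-22"), 2, 1),
                new Item(2, LocalDate.parse("2021-09-22"), 2, 1),
                new Item(3, LocalDate.parse("2022-09-22"), 2, 1),
                new Item(4, LocalDate.parse("2022-10-22"), 2, 1));
        collapse(list, 90).forEach(System.out::println);
    }
}
```

For the four objects from the question this keeps the items with id 1 and id 3. One consequence of anchoring each group to its first date is that winners of neighbouring groups can still be fewer than 90 days apart: with the question's A/B/C dates (2022-01-01, 2022-03-01, 2022-05-01), A and B form one group and C starts a new one, so both B and C are kept.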

    Example code, which I have supplemented with additional list elements:

    import java.time.LocalDate;
    import java.util.ArrayList;
    import java.util.Comparator;
    import java.util.LinkedHashMap;
    import java.util.List;
    import java.util.TreeSet;
    import java.util.concurrent.atomic.AtomicReference;
    import java.util.stream.Collector;
    import java.util.stream.Collectors;
    
    import lombok.AllArgsConstructor;
    import lombok.Getter;
    import lombok.NoArgsConstructor;
    import lombok.Setter;
    import lombok.ToString;
    
    import static java.time.temporal.ChronoUnit.DAYS;
    
    public class Example {
        public static void main(String args[]) {
    
            List<ObjectA> list = List.of(
                    new ObjectA(1, LocalDate.parse("2021-09-22"), 2, 1),
                    new ObjectA(2, LocalDate.parse("2021-09-22"), 2, 1),
    
                    new ObjectA(3, LocalDate.parse("2022-09-22"), 2, 1),
                    new ObjectA(4, LocalDate.parse("2022-10-22"), 2, 1),
    
                    new ObjectA(10, LocalDate.parse("2025-09-22"), 2, 1),
                    new ObjectA(20, LocalDate.parse("2025-09-22"), 2, 1),
                    new ObjectA(30, LocalDate.parse("2025-09-23"), 2, 1),
    
                    new ObjectA(40, LocalDate.parse("2029-09-22"), 2, 1),
                    new ObjectA(13, LocalDate.parse("2029-09-22"), 2, 1),
                    new ObjectA(23, LocalDate.parse("2029-09-22"), 2, 1),
                    new ObjectA(33, LocalDate.parse("2029-09-22"), 2, 1),
                    new ObjectA(4, LocalDate.parse("2029-09-22"), 2, 1)
            );
    
    
    
            // for readability, each comparator is kept in its own variable
            Comparator<ObjectA> byPrioA = Comparator.comparing(ObjectA::getPriorityA, Comparator.reverseOrder());
            Comparator<ObjectA> byPrioB = Comparator.comparing(ObjectA::getPriorityB, Comparator.reverseOrder());
            Comparator<ObjectA> byDate  = Comparator.comparing(ObjectA::getDate);
            Comparator<ObjectA> byId    = Comparator.comparing(ObjectA::getId);
    
            //first who has the largest "priorityA" attribute, if they are equal then
            //check "priorityB", if this is also the same
            //check the lesser date and if that is also the same then
            //the one with the lesser id.
            Comparator<ObjectA> combined = byPrioA.thenComparing(byPrioB).thenComparing(byDate).thenComparing(byId);
    
            //sort list by date
            List<ObjectA> sortedByDate = list.stream().sorted(byDate).collect(Collectors.toList());
    
            //store first date for first group key
            AtomicReference<LocalDate> ar = new AtomicReference<>(sortedByDate.get(0).getDate());
    
            sortedByDate.stream().collect(Collectors.groupingBy(
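                    // stateful classifier: only valid for a sequential stream over the date-sorted list;
                    // "no more than 90 days apart" means a gap of exactly 90 days stays in the current group
                    d -> DAYS.between(ar.get(), d.getDate()) <= 90 ? ar.get() : ar.accumulateAndGet(d.getDate(), (u,v) ->v),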
                    LinkedHashMap::new,
                    Collectors.collectingAndThen(toSortedList(combined), l -> l.get(0))))
                    .values()
                    .forEach(System.out::println);
        }
    
        //https://stackoverflow.com/questions/35872236/sorting-lists-after-groupingby
        static <T> Collector<T,?,List<T>> toSortedList(Comparator<? super T> c) {
            return Collectors.collectingAndThen(
                    Collectors.toCollection(()->new TreeSet<>(c)), ArrayList::new);
        }
    
        @Getter
        @Setter
        @AllArgsConstructor
        @NoArgsConstructor
        @ToString
        public static class ObjectA {
            int id;
            LocalDate date;
            int priorityA;
            int priorityB;
        }
    }
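
If the goal from the question's EDIT matters most (no two surviving objects within 90 days of each other, and only B surviving in the A/B/C example), one possible alternative is a greedy pass instead of grouping: visit the objects from strongest to weakest and keep an object only if nothing already kept lies within the window. This is a different technique from the grouping code above, shown only as a sketch under assumptions: `GreedyCollapse`, `Item` (a stand-in for `ObjectA`, needs Java 16+) and `collapse` are hypothetical names, and the A/B/C priorities are mapped to `priorityA` with `priorityB` set to 0.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class GreedyCollapse {

    // Hypothetical stand-in for the question's ObjectA.
    record Item(int id, LocalDate date, int priorityA, int priorityB) {}

    static List<Item> collapse(List<Item> items, long windowDays) {
        // Winner first: higher priorityA, higher priorityB, earlier date, lower id.
        Comparator<Item> byPrioA = Comparator.comparingInt(Item::priorityA);
        Comparator<Item> byPrioB = Comparator.comparingInt(Item::priorityB);
        Comparator<Item> winnerFirst = byPrioA.reversed()
                .thenComparing(byPrioB.reversed())
                .thenComparing(Item::date)
                .thenComparingInt(Item::id);

        List<Item> candidates = new ArrayList<>(items);
        candidates.sort(winnerFirst);

        List<Item> kept = new ArrayList<>();
        for (Item candidate : candidates) {
            // A candidate loses if any stronger, already kept item is within the window.
            boolean clashes = kept.stream().anyMatch(k ->
                    Math.abs(ChronoUnit.DAYS.between(k.date(), candidate.date())) <= windowDays);
            if (!clashes) {
                kept.add(candidate);
            }
        }
        kept.sort(Comparator.comparing(Item::date)); // return survivors in date order
        return kept;
    }

    public static void main(String[] args) {
        List<Item> abc = List.of(
                new Item(1, LocalDate.parse("2022-01-01"), 1, 0),  // A
                new Item(2, LocalDate.parse("2022-03-01"), 9, 0),  // B
                new Item(3, LocalDate.parse("2022-05-01"), 8, 0)); // C
        collapse(abc, 90).forEach(System.out::println);
    }
}
```

With the A/B/C input only B (id 2) is kept, and with the four objects from the question the survivors are id 1 and id 3. The check against every kept item makes this quadratic in the worst case, which is usually acceptable for small lists; it is not meant as a drop-in replacement for the collector-based code.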