Tags: java, java-stream, predicate, predicatebuilder

Extract Objects from Collection<Objects> based on a Map input


I have a collection of around 13,360 Account objects, like the ones below.

Input Data:

Account(id,date,balance,region,cost)
Account("1","2019-07-24","X","Y","Z")
Account("1","2019-07-24","C","Y","Z")
Account("1","2019-07-23","X","D","Z")
Account("1","2019-07-23","X","Y","f")
Account("1","2019-07-22","X","s","Z")

Account("2","2019-07-23","X","A","Z")
Account("2","2019-07-23","X","Y","d")
Account("2","2019-07-22","d","Y","Z")
Account("2","2019-07-23","X","s","Z")

Account("3","2019-07-24","d","Y","d")

Account("4","2019-07-24","X","Y","Z")
Account("4","2019-07-23","d","Y","Z")

Account("5","2019-07-23","X","d","Z")
Account("5","2019-07-22","X","Y","Z")

Filter criteria:

Map<id,date>
(1, 2019-07-24), (2, 2019-07-23), (5, 2019-07-23)

Expected result:

Account("1","2019-07-24","X","Y","Z")
Account("1","2019-07-24","C","Y","Z")

Account("2","2019-07-23","X","A","Z")
Account("2","2019-07-23","X","Y","d")
Account("2","2019-07-23","X","s","Z")

Account("5","2019-07-23","X","d","Z")

So I want to retrieve the most recent data for certain accounts.

The code sample below only gives me data for today's date for a given list of accounts. For some accounts, however, I have no data for today's date, so I need to retrieve the most recent data that is available.

EntryObject eo = new PredicateBuilder().getEntryObject();
Predicate p = eo.get("id").in(1, 2, 5).and(eo.get("date").equal(todaysdate));
Collection<Account> coll = accounts.values(p);

Solution

  • Here is a more generic solution to get only the latest entries for a set of accountIds:

    Set<Integer> accountIds = Set.of(1, 2, 5);
    List<Account> result = accounts.stream()
            .filter(a -> accountIds.contains(a.getId()))
            .collect(Collectors.groupingBy(Account::getId,
                    Collectors.groupingBy(Account::getDate, TreeMap::new, Collectors.toList())))
            .values().stream()
            .flatMap(m -> m.lastEntry().getValue().stream())
            .collect(Collectors.toList());
    

    First, keep only the accounts whose id is in the set. Then group them by id and, within each id, by date; collecting the inner grouping into a TreeMap keeps the dates in sorted order. This gives the following intermediate result:

    {
      1: {
        2019-07-22: [{id: 1, date: 2019-07-22, balance: 'X', region: 's', cost: 'Z'}], 
        2019-07-23: [{id: 1, date: 2019-07-23, balance: 'X', region: 'D', cost: 'Z'}, {id: 1, date: 2019-07-23, balance: 'X', region: 'Y', cost: 'f'}],
        2019-07-24: [{id: 1, date: 2019-07-24, balance: 'X', region: 'Y', cost: 'Z'}, {id: 1, date: 2019-07-24, balance: 'C', region: 'Y', cost: 'Z'}]
      }, 
      2: {
        2019-07-22: [{id: 2, date: 2019-07-22, balance: 'd', region: 'Y', cost: 'Z'}], 
        2019-07-23: [{id: 2, date: 2019-07-23, balance: 'X', region: 'A', cost: 'Z'}, {id: 2, date: 2019-07-23, balance: 'X', region: 'Y', cost: 'd'}, {id: 2, date: 2019-07-23, balance: 'X', region: 's', cost: 'Z'}]
      }, 
      5: {
        2019-07-22: [{id: 5, date: 2019-07-22, balance: 'X', region: 'Y', cost: 'Z'}], 
        2019-07-23: [{id: 5, date: 2019-07-23, balance: 'X', region: 'd', cost: 'Z'}]
      }
    }
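
The inner grouping relies on TreeMap keeping its keys sorted, so the last entry is the one with the greatest key. The following is a small sketch of just that step, assuming the dates are yyyy-MM-dd strings (which sort chronologically when compared as text) or another Comparable type such as LocalDate. The class name, the helper names, and the shortened "balance/region/cost" strings are illustrative only and not part of the original code.

```java
import java.util.List;
import java.util.TreeMap;

class LatestDateEntryDemo {

    // Returns the list stored under the greatest key of the sorted map.
    static List<String> latest(TreeMap<String, List<String>> byDate) {
        return byDate.lastEntry().getValue();
    }

    // Entries for account 1 from the question, inserted out of order on purpose.
    static TreeMap<String, List<String>> sample() {
        TreeMap<String, List<String>> byDate = new TreeMap<>();
        byDate.put("2019-07-23", List.of("X/D/Z", "X/Y/f"));
        byDate.put("2019-07-24", List.of("X/Y/Z", "C/Y/Z"));
        byDate.put("2019-07-22", List.of("X/s/Z"));
        return byDate;
    }

    public static void main(String[] args) {
        TreeMap<String, List<String>> byDate = sample();
        System.out.println(byDate.lastKey());   // prints 2019-07-24
        System.out.println(latest(byDate));     // prints [X/Y/Z, C/Y/Z]
    }
}
```

If the dates were stored in a format that does not sort chronologically as text (for example dd-MM-yyyy), they would need to be parsed to a date type first for this to work.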
    

    Finally, take only the values of the outer map and flatMap each inner TreeMap to the value of its last entry, which is the list of accounts with the latest date for that id.

    The final result will be this:

    [
      {id: 1, date: 2019-07-24, balance: 'X', region: 'Y', cost: 'Z'}, 
      {id: 1, date: 2019-07-24, balance: 'C', region: 'Y', cost: 'Z'}, 
    
      {id: 2, date: 2019-07-23, balance: 'X', region: 'A', cost: 'Z'}, 
      {id: 2, date: 2019-07-23, balance: 'X', region: 'Y', cost: 'd'}, 
      {id: 2, date: 2019-07-23, balance: 'X', region: 's', cost: 'Z'},
    
      {id: 5, date: 2019-07-23, balance: 'X', region: 'd', cost: 'Z'}
    ]
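
If the (id, date) pairs are already known, as in the Map<id,date> filter criteria from the question, a direct lookup in that map may be enough and no grouping is needed. The following is a minimal sketch under assumptions: Account is a hypothetical stand-in with String fields, and the class name, filterByIdAndDate, and sampleAccounts are illustrative names that do not appear in the original code.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class AccountMapFilterDemo {

    // Hypothetical stand-in for the question's Account class (all fields as Strings).
    static final class Account {
        final String id, date, balance, region, cost;

        Account(String id, String date, String balance, String region, String cost) {
            this.id = id;
            this.date = date;
            this.balance = balance;
            this.region = region;
            this.cost = cost;
        }

        String getId() { return id; }
        String getDate() { return date; }

        @Override
        public String toString() {
            return "Account(" + id + "," + date + "," + balance + "," + region + "," + cost + ")";
        }
    }

    // Keeps an account only if the criteria map holds exactly its date under its id.
    // Ids missing from the map yield null from get(), so equals() is false for them.
    static List<Account> filterByIdAndDate(List<Account> accounts, Map<String, String> criteria) {
        return accounts.stream()
                .filter(a -> a.getDate().equals(criteria.get(a.getId())))
                .collect(Collectors.toList());
    }

    // The sample data from the question.
    static List<Account> sampleAccounts() {
        return List.of(
                new Account("1", "2019-07-24", "X", "Y", "Z"),
                new Account("1", "2019-07-24", "C", "Y", "Z"),
                new Account("1", "2019-07-23", "X", "D", "Z"),
                new Account("1", "2019-07-23", "X", "Y", "f"),
                new Account("1", "2019-07-22", "X", "s", "Z"),
                new Account("2", "2019-07-23", "X", "A", "Z"),
                new Account("2", "2019-07-23", "X", "Y", "d"),
                new Account("2", "2019-07-22", "d", "Y", "Z"),
                new Account("2", "2019-07-23", "X", "s", "Z"),
                new Account("3", "2019-07-24", "d", "Y", "d"),
                new Account("4", "2019-07-24", "X", "Y", "Z"),
                new Account("4", "2019-07-23", "d", "Y", "Z"),
                new Account("5", "2019-07-23", "X", "d", "Z"),
                new Account("5", "2019-07-22", "X", "Y", "Z"));
    }

    public static void main(String[] args) {
        // Filter criteria from the question: id -> date to keep.
        Map<String, String> criteria = Map.of(
                "1", "2019-07-24",
                "2", "2019-07-23",
                "5", "2019-07-23");
        filterByIdAndDate(sampleAccounts(), criteria).forEach(System.out::println);
    }
}
```

With the question's sample data and criteria this keeps six accounts: two for id 1, three for id 2, and one for id 5. Unlike the grouping approach above, it does not discover the latest date itself; it only matches the dates supplied in the map.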