I have a collection of around 13,360 Account objects, something like the ones below.
Input Data:
Account(id,date,balance,region,cost)
Account("1","2019-07-24","X","Y","Z")
Account("1","2019-07-24","C","Y","Z")
Account("1","2019-07-23","X","D","Z")
Account("1","2019-07-23","X","Y","f")
Account("1","2019-07-22","X","s","Z")
Account("2","2019-07-23","X","A","Z")
Account("2","2019-07-23","X","Y","d")
Account("2","2019-07-22","d","Y","Z")
Account("2","2019-07-23","X","s","Z")
Account("3","2019-07-24","d","Y","d")
Account("4","2019-07-24","X","Y","Z")
Account("4","2019-07-23","d","Y","Z")
Account("5","2019-07-23","X","d","Z")
Account("5","2019-07-22","X","Y","Z")
Filter criteria:
Map<id,date>
(1,2019-07-24), (2,2019-07-23),(5,2019-07-23)
Expected result:
Account("1","2019-07-24","X","Y","Z")
Account("1","2019-07-24","C","Y","Z")
Account("2","2019-07-23","X","A","Z")
Account("2","2019-07-23","X","Y","d")
Account("2","2019-07-23","X","s","Z")
Account("5","2019-07-23","X","d","Z")
So I want to retrieve the most recent data for certain accounts.
The code sample below only gives me the data for today's date for a given list of accounts. For some accounts, however, I do not have data for today's date, so I need to retrieve the most recent data that is available.
EntryObject eo = new PredicateBuilder().getEntryObject();
Predicate p = eo.get("id").in(1, 2, 5).and(eo.get("date").equal(todaysdate));
Collection<Account> coll = accounts.values(p);
Here is a more generic solution to get only the latest entries for a set of accountIds:
Set<Integer> accountIds = Set.of(1, 2, 5);
List<Account> result = accounts.stream()
        .filter(a -> accountIds.contains(a.getId()))
        .collect(Collectors.groupingBy(Account::getId,
                Collectors.groupingBy(Account::getDate, TreeMap::new, Collectors.toList())))
        .values().stream()
        .flatMap(m -> m.lastEntry().getValue().stream())
        .collect(Collectors.toList());
First, you keep only the required accounts by filtering on the id. After that, you group them by id and then by date, which gives you this intermediate result:
{
1: {
2019-07-22: [{id: 1, date: 2019-07-22, balance: 'X', region: 's', cost: 'Z'}],
2019-07-23: [{id: 1, date: 2019-07-23, balance: 'X', region: 'D', cost: 'Z'}, {id: 1, date: 2019-07-23, balance: 'X', region: 'Y', cost: 'f'}],
2019-07-24: [{id: 1, date: 2019-07-24, balance: 'X', region: 'Y', cost: 'Z'}, {id: 1, date: 2019-07-24, balance: 'C', region: 'Y', cost: 'Z'}]
},
2: {
2019-07-22: [{id: 2, date: 2019-07-22, balance: 'd', region: 'Y', cost: 'Z'}],
2019-07-23: [{id: 2, date: 2019-07-23, balance: 'X', region: 'A', cost: 'Z'}, {id: 2, date: 2019-07-23, balance: 'X', region: 'Y', cost: 'd'}, {id: 2, date: 2019-07-23, balance: 'X', region: 's', cost: 'Z'}]
},
5: {
2019-07-22: [{id: 5, date: 2019-07-22, balance: 'X', region: 'Y', cost: 'Z'}],
2019-07-23: [{id: 5, date: 2019-07-23, balance: 'X', region: 'd', cost: 'Z'}]
}
}
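The dates inside each id are in ascending order because the inner maps are TreeMaps, which keep their keys sorted. A minimal sketch of that behaviour, assuming the dates are ISO-8601 strings (yyyy-MM-dd) so that String order matches chronological order; the class and method names here are made up for illustration:

```java
import java.util.List;
import java.util.TreeMap;

class TreeMapLastEntryDemo {

    // Hypothetical helper: returns the greatest date after inserting them in arbitrary order.
    static String latestDate(List<String> isoDates) {
        TreeMap<String, Integer> countsByDate = new TreeMap<>();
        for (String date : isoDates) {
            countsByDate.merge(date, 1, Integer::sum); // count occurrences per date
        }
        // lastEntry() is the entry with the greatest key; it is null for an empty map
        return countsByDate.lastEntry().getKey();
    }
}
```

In the stream above the inner maps are never empty, because groupingBy only creates a group when at least one element falls into it. If the dates were stored in another format (for example dd-MM-yyyy), the String order would no longer be chronological, and a LocalDate key would be the safer choice.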
Finally, you take only the values of the resulting map and flatMap each inner TreeMap to the value of its last entry, which is the list of accounts with the latest date for that id.
The final result will be this:
[
{id: 1, date: 2019-07-24, balance: 'X', region: 'Y', cost: 'Z'},
{id: 1, date: 2019-07-24, balance: 'C', region: 'Y', cost: 'Z'},
{id: 2, date: 2019-07-23, balance: 'X', region: 'A', cost: 'Z'},
{id: 2, date: 2019-07-23, balance: 'X', region: 'Y', cost: 'd'},
{id: 2, date: 2019-07-23, balance: 'X', region: 's', cost: 'Z'},
{id: 5, date: 2019-07-23, balance: 'X', region: 'd', cost: 'Z'}
]
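To try this end to end, here is a self-contained sketch using the sample data from the question. The Account class below is only a stand-in (an int id and a String date in yyyy-MM-dd format are assumptions), and latestFor and sample are hypothetical helper names:

```java
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.TreeMap;
import java.util.stream.Collectors;

class LatestAccounts {

    // Hypothetical stand-in for the question's Account class; the real one may differ.
    static final class Account {
        private final int id;
        private final String date; // assumed yyyy-MM-dd, so String order is chronological
        private final String balance;
        private final String region;
        private final String cost;

        Account(int id, String date, String balance, String region, String cost) {
            this.id = id;
            this.date = date;
            this.balance = balance;
            this.region = region;
            this.cost = cost;
        }

        int getId() { return id; }
        String getDate() { return date; }

        @Override
        public String toString() {
            return "Account(" + id + "," + date + "," + balance + "," + region + "," + cost + ")";
        }
    }

    // For each requested id, returns all accounts that share that id's most recent date.
    static List<Account> latestFor(Collection<Account> accounts, Set<Integer> accountIds) {
        return accounts.stream()
                .filter(a -> accountIds.contains(a.getId()))
                .collect(Collectors.groupingBy(Account::getId,
                        Collectors.groupingBy(Account::getDate, TreeMap::new, Collectors.toList())))
                .values().stream()
                .flatMap(m -> m.lastEntry().getValue().stream())
                .collect(Collectors.toList());
    }

    // The sample data from the question.
    static List<Account> sample() {
        return List.of(
                new Account(1, "2019-07-24", "X", "Y", "Z"),
                new Account(1, "2019-07-24", "C", "Y", "Z"),
                new Account(1, "2019-07-23", "X", "D", "Z"),
                new Account(1, "2019-07-23", "X", "Y", "f"),
                new Account(1, "2019-07-22", "X", "s", "Z"),
                new Account(2, "2019-07-23", "X", "A", "Z"),
                new Account(2, "2019-07-23", "X", "Y", "d"),
                new Account(2, "2019-07-22", "d", "Y", "Z"),
                new Account(2, "2019-07-23", "X", "s", "Z"),
                new Account(3, "2019-07-24", "d", "Y", "d"),
                new Account(4, "2019-07-24", "X", "Y", "Z"),
                new Account(4, "2019-07-23", "d", "Y", "Z"),
                new Account(5, "2019-07-23", "X", "d", "Z"),
                new Account(5, "2019-07-22", "X", "Y", "Z"));
    }

    public static void main(String[] args) {
        latestFor(sample(), Set.of(1, 2, 5)).forEach(System.out::println);
    }
}
```

The outer groupingBy collects into a HashMap, so the order of the ids in the result is not guaranteed. If you need the ids sorted, pass TreeMap::new to the outer groupingBy as well.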