Assume we have a Person class with these fields:
class Person {
    private String name;
    private Integer id; // unique
}
And then we have a List<Person> people such that:
['Jerry', 993]
['Tom', 3]
['Neal', 443]
['Jerry', 112]
['Shannon', 259]
['Shannon', 533]
How can I build a new List<Person> uniqueNames that contains each name only once AND keeps the entry with the highest ID for that name?
So the resulting list would look like:
['Jerry', 993]
['Tom', 3]
['Neal', 443]
['Shannon', 533]
Collectors.groupingBy combined with Collectors.maxBy should do the trick: group the persons by name into a map, then select the person with the maximum ID in each group:
List<Person> persons = Arrays.asList(
    new Person("Jerry", 123),
    new Person("Tom", 234),
    new Person("Jerry", 456),
    new Person("Jake", 789)
);
List<Person> maxById = persons
    .stream()
    .collect(Collectors.groupingBy(
        Person::getName,
        Collectors.maxBy(Comparator.comparingInt(Person::getID))
    ))
    .values() // Collection<Optional<Person>>
    .stream() // Stream<Optional<Person>>
    .map(opt -> opt.orElse(null)) // groups are never empty, so null is not expected here
    .collect(Collectors.toList());
System.out.println(maxById);
Output (the map returned by groupingBy makes no ordering guarantee, so the order may differ):
[789: Jake, 234: Tom, 456: Jerry]
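The Optional unwrapping can be avoided with Collectors.toMap and a merge function. The following is a minimal sketch of that alternative, not part of the original answer: the class name MaxIdPerName and the nested Person are assumptions made for illustration (the getter is called getID to match the snippets above), and the data comes from the question. A LinkedHashMap is used so that names keep their first-seen order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.stream.Collectors;

public class MaxIdPerName {

    // Hypothetical minimal Person, assumed for this sketch only.
    public static class Person {
        private final String name;
        private final int id;

        public Person(String name, int id) {
            this.name = name;
            this.id = id;
        }

        public String getName() { return name; }
        public int getID() { return id; }

        @Override
        public String toString() { return id + ": " + name; }
    }

    // For each name, keeps the person with the highest ID, in first-seen name order.
    public static List<Person> maxIdPerName(List<Person> people) {
        return new ArrayList<>(people.stream()
            .collect(Collectors.toMap(
                Person::getName,
                Function.identity(),
                // on a name collision, keep whichever person has the larger ID
                BinaryOperator.maxBy(Comparator.comparingInt(Person::getID)),
                LinkedHashMap::new))
            .values());
    }

    public static void main(String[] args) {
        List<Person> people = Arrays.asList(
            new Person("Jerry", 993),
            new Person("Tom", 3),
            new Person("Neal", 443),
            new Person("Jerry", 112),
            new Person("Shannon", 259),
            new Person("Shannon", 533)
        );
        System.out.println(maxIdPerName(people));
        // prints [993: Jerry, 3: Tom, 443: Neal, 533: Shannon]
    }
}
```

The merge function is only called when two persons share a name, so no Optional is ever created.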
Update
Is there a way to get a separate list of the Person objects that were removed as duplicates within this stream()?
It may be better to collect each group into a list and then convert that list into a wrapper class that holds both the person with the maximum ID and the list of persons removed as duplicates:
class PersonList {
    private final Person max;
    private final List<Person> deduped;

    public PersonList(List<Person> group) {
        this.max = Collections.max(group, Comparator.comparingInt(Person::getID));
        this.deduped = new ArrayList<>(group);
        // Remove the max itself; comparing references stays correct even if getID() returns a boxed Integer
        this.deduped.removeIf(p -> p == max);
    }

    @Override
    public String toString() {
        return "{max: " + max + "; deduped: " + deduped + "}";
    }
}
Then the persons should be collected like this:
List<PersonList> maxByIdDetails = new ArrayList<>(persons
    .stream()
    .collect(Collectors.groupingBy(
        Person::getName,
        LinkedHashMap::new, // keeps the groups in encounter order
        Collectors.collectingAndThen(
            Collectors.toList(), PersonList::new
        )
    ))
    .values()); // Collection<PersonList>
maxByIdDetails.forEach(System.out::println);
Output:
{max: 456: Jerry; deduped: [123: Jerry]}
{max: 234: Tom; deduped: []}
{max: 789: Jake; deduped: []}
Update 2
Getting list of duplicated persons:
List<Person> duplicates = persons
    .stream()
    .collect(Collectors.groupingBy(Person::getName))
    .values() // Collection<List<Person>>
    .stream() // Stream<List<Person>>
    .map(MyClass::removeMax)
    .flatMap(List::stream) // Stream<Person>
    .collect(Collectors.toList()); // List<Person>
System.out.println(duplicates);
Output:
[123: Jerry]
where removeMax may be implemented like this:
private static List<Person> removeMax(List<Person> group) {
    List<Person> dupes = new ArrayList<>();
    Person max = null;
    for (Person p : group) {
        Person duped = null;
        if (null == max) {
            max = p; // first element is the initial max
        } else if (p.getID() > max.getID()) {
            duped = max; // the previous max is now a duplicate
            max = p;
        } else {
            duped = p;
        }
        if (null != duped) {
            dupes.add(duped);
        }
    }
    return dupes;
}
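As an alternative to the explicit loop in removeMax, each group can be sorted by ID in descending order and its first element skipped. This is only a sketch under assumptions: the class name DuplicatesBySkip and the nested Person are hypothetical, and the data is taken from the question. Sorting costs more than the single pass of removeMax, so this trades some efficiency for brevity.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.stream.Collectors;

public class DuplicatesBySkip {

    // Hypothetical minimal Person, assumed for this sketch only.
    public static class Person {
        private final String name;
        private final int id;

        public Person(String name, int id) {
            this.name = name;
            this.id = id;
        }

        public String getName() { return name; }
        public int getID() { return id; }

        @Override
        public String toString() { return id + ": " + name; }
    }

    // Returns every person except the one with the highest ID in each name group.
    public static List<Person> duplicates(List<Person> people) {
        Comparator<Person> byId = Comparator.comparingInt(Person::getID);
        return people.stream()
            .collect(Collectors.groupingBy(
                Person::getName, LinkedHashMap::new, Collectors.toList()))
            .values().stream()
            // highest ID first, then drop it; what remains are the duplicates
            .flatMap(group -> group.stream().sorted(byId.reversed()).skip(1))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Person> people = Arrays.asList(
            new Person("Jerry", 993),
            new Person("Tom", 3),
            new Person("Neal", 443),
            new Person("Jerry", 112),
            new Person("Shannon", 259),
            new Person("Shannon", 533)
        );
        System.out.println(duplicates(people));
        // prints [112: Jerry, 259: Shannon]
    }
}
```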
Or, provided that hashCode and equals are implemented properly in class Person, the difference between the two lists can be computed with removeAll:
List<Person> duplicates2 = new ArrayList<>(persons);
duplicates2.removeAll(maxById);
System.out.println(duplicates2);
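What "implemented properly" could look like is sketched below. The class name RemoveAllDemo, the nested Person and the helper difference are assumptions made for illustration; equality is based on both name and ID. The kept list is deliberately built from fresh instances to show that removeAll relies on equals rather than on object identity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

public class RemoveAllDemo {

    // Hypothetical Person with value-based equals/hashCode, assumed for this sketch only.
    public static class Person {
        private final String name;
        private final int id;

        public Person(String name, int id) {
            this.name = name;
            this.id = id;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Person)) return false;
            Person other = (Person) o;
            return id == other.id && Objects.equals(name, other.name);
        }

        @Override
        public int hashCode() { return Objects.hash(name, id); }

        @Override
        public String toString() { return id + ": " + name; }
    }

    // Returns the elements of all that are not equal to any element of kept.
    public static List<Person> difference(List<Person> all, List<Person> kept) {
        List<Person> rest = new ArrayList<>(all); // copy, so the input list is not modified
        rest.removeAll(kept);
        return rest;
    }

    public static void main(String[] args) {
        List<Person> persons = Arrays.asList(
            new Person("Jerry", 123),
            new Person("Tom", 234),
            new Person("Jerry", 456),
            new Person("Jake", 789)
        );
        // Equal by value to the max-per-name persons, but different instances
        List<Person> kept = Arrays.asList(
            new Person("Tom", 234),
            new Person("Jerry", 456),
            new Person("Jake", 789)
        );
        System.out.println(difference(persons, kept));
        // prints [123: Jerry]
    }
}
```

Without overriding equals, removeAll falls back to Object.equals, which compares references, so it would only work when both lists hold the very same instances.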