I have two lists: one contains categories, and the other contains more detailed entries, such as:
List<String> categoryList = new ArrayList<>(3);
categoryList.add("cat");
categoryList.add("dog");
categoryList.add("bull");
categoryList.add("other");
List<String> detailList = new ArrayList<>();
detailList.add("cat a");
detailList.add("cat b");
detailList.add("dog a");
detailList.add("dog b");
detailList.add("dog c");
detailList.add("bull a");
detailList.add("bird a");
detailList.add("bird b");
Map<String, List<String>> map = new HashMap<>();
for (String category : categoryList) {
    map.put(category, new ArrayList<>());
}
boolean isFind = false;
for (String detail : detailList) {
    isFind = false;
    for (String category : categoryList) {
        if (StrUtil.containsIgnoreCase(detail, category)) {
            map.get(category).add(detail);
            isFind = true;
            break;
        }
    }
    if (!isFind) {
        map.get("other").add(detail);
    }
}
System.out.println(map);
The output is: {other=[bird a, bird b], cat=[cat a, cat b], dog=[dog a, dog b, dog c], bull=[bull a]}
I use a loop, but I wonder whether there is a more advanced way to do this. Thanks.
There are multiple ways to achieve this.
Collectors#groupingBy and List#contains
Assuming your details and categories always have the same structure:
Map<String, List<String>> result = detailList.stream()
    .collect(Collectors.groupingBy(detail -> {
        String category = detail.substring(0, detail.indexOf(' '));
        return categoryList.contains(category) ? category : "other";
    }));
However, List#contains performs poorly when the amount of data is large, because it has a time complexity of O(n). So I would advise going for the next one.
Collectors#groupingBy and Set#contains
HashSet<String> categories = new HashSet<>(categoryList);
Map<String, List<String>> result = detailList.stream()
    .collect(Collectors.groupingBy(detail -> {
        String category = detail.substring(0, detail.indexOf(' '));
        return categories.contains(category) ? category : "other";
    }));
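For reference, here is a minimal, self-contained sketch of the Set-based variant. The class name GroupingByCategoryExample and the method name group are made up for illustration. Unlike the snippet above, it treats a detail without a space as a single word instead of letting substring throw. Note two behavioural differences from the loop in the question: matching is case-sensitive here, and categories that receive no details (for example "bull" if no detail starts with it) do not appear as keys in the result.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical class name, used only for this illustration.
class GroupingByCategoryExample {

    // Groups each detail under its first word if that word is a known category,
    // otherwise under "other".
    static Map<String, List<String>> group(List<String> categoryList, List<String> detailList) {
        Set<String> categories = new HashSet<>(categoryList);
        return detailList.stream().collect(Collectors.groupingBy(detail -> {
            int space = detail.indexOf(' ');
            // Assumption: a detail without a space is treated as one word.
            String firstWord = space < 0 ? detail : detail.substring(0, space);
            return categories.contains(firstWord) ? firstWord : "other";
        }));
    }

    public static void main(String[] args) {
        List<String> categoryList = Arrays.asList("cat", "dog", "bull", "other");
        List<String> detailList = Arrays.asList(
                "cat a", "cat b", "dog a", "dog b", "dog c", "bull a", "bird a", "bird b");
        System.out.println(group(categoryList, detailList));
    }
}
```

If you need every category to be present even when it is empty, you would have to add the missing keys afterwards, since groupingBy only creates entries for keys it actually sees.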
If you prefer looping over the Stream API, however, you would want something a bit more performant than what you have there, and also a bit more robust.
Here are two problems I notice with your current code:
1. Its time complexity is O(m x n), while it could be O(m + n) assuming both lists are sorted alphabetically.
2. It uses String#contains to identify whether a detail should be in one or the other category, possibly ending up with caterpillars inside of the cat category.
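One possible loop-based sketch that addresses both points is shown below. It is not the only way to do it: instead of relying on sorted lists, it looks up the first word of each detail in a hash-based map, which gives an expected O(m + n) as well. The class name LoopCategorizer and the method name categorize are made up for illustration, and the code assumes that category names are lower-case and that the category is always the first word of a detail.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical class name, used only for this illustration.
class LoopCategorizer {

    static Map<String, List<String>> categorize(List<String> categoryList, List<String> detailList) {
        // LinkedHashMap keeps the categories in the order they were declared.
        Map<String, List<String>> map = new LinkedHashMap<>();
        for (String category : categoryList) {
            map.put(category, new ArrayList<>());
        }
        // Make sure the fallback bucket exists even if it is not in categoryList.
        map.putIfAbsent("other", new ArrayList<>());

        for (String detail : detailList) {
            int space = detail.indexOf(' ');
            // Compare the whole first word, so "caterpillar x" does not match "cat".
            // Assumption: category names are lower-case, so lower-casing the word
            // gives a case-insensitive match.
            String firstWord = (space < 0 ? detail : detail.substring(0, space))
                    .toLowerCase(Locale.ROOT);
            List<String> bucket = map.get(firstWord);
            if (bucket == null) {
                bucket = map.get("other");
            }
            bucket.add(detail);
        }
        return map;
    }

    public static void main(String[] args) {
        List<String> categoryList = Arrays.asList("cat", "dog", "bull", "other");
        List<String> detailList = Arrays.asList(
                "cat a", "cat b", "dog a", "dog b", "dog c", "bull a", "bird a", "bird b");
        // Prints {cat=[cat a, cat b], dog=[dog a, dog b, dog c], bull=[bull a], other=[bird a, bird b]}
        System.out.println(categorize(categoryList, detailList));
    }
}
```

As in the original loop, every category is present in the result even when no detail falls into it.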