java collections java-5

Get all duplicate values from multiple lists in Java


I want to get all values that are duplicated across multiple Lists of Integers. The confusing part is that these lists are nested inside a Map of Maps, like this: LinkedHashMap<String, LinkedHashMap<String, List<Integer>>> streams

// sample value
{
    break_desc100=
    {
        bDesc_1000=[62, 72, 82, 92, 102, 112, 122], 
        bDesc_1001=[180, 190, 200, 210, 220, 230, 240], 
        cMessage_1000=[112], 
        cMessage_1001=[232]
    }
}
// for this one I want to get 112

So far I have tried using retainAll, but my code doesn't work if the lists that share a duplicate are not next to each other.

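// State carried across iterations (these declarations were missing from the snippet)
boolean firstIteration = true;
Map.Entry<String,List<Integer>> prevBDesc = null;
List<Integer> duplicates;
Set<Integer> allDuplicates = new HashSet<Integer>();

for (Map.Entry<String,LinkedHashMap<String,List<Integer>>> entry : streams.entrySet()) {
    String currentStream = entry.getKey();
    LinkedHashMap<String,List<Integer>> bDescList = entry.getValue();
    for (Map.Entry<String,List<Integer>> bDesc : bDescList.entrySet()) {
        if (firstIteration) {
            prevBDesc = bDesc;
            firstIteration = false;
        } else {
            // Only compares each list with the one immediately before it
            List<Integer> currentList = prevBDesc.getValue();
            List<Integer> nextList = bDesc.getValue();
            duplicates = new ArrayList<Integer>(currentList);
            duplicates.retainAll(nextList);
            allDuplicates.addAll(duplicates); //Set<Integer>
            prevBDesc = bDesc;
        }
    }
}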

EDIT: Sorry guys I forgot to add that it is running on Java 1.5.


Solution

  • This seems like a suitable task for streams (available from Java 8 onward):

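    import java.util.Map;
    import java.util.Set;
    import java.util.function.Function;
    import java.util.stream.Collectors;

    // Count how often each number occurs across all inner lists
    Map<Integer, Long> counts = streams.values().stream()
           .flatMap(bDescList -> bDescList.values().stream())
           .flatMap(nextList -> nextList.stream())
           .collect(Collectors.groupingBy(
                    Function.identity(), 
                    Collectors.counting()));
    
    // Drop every number that occurs only once
    counts.values().removeIf(c -> c == 1L);
    
    // The keys that remain are the duplicates
    Set<Integer> duplicates = counts.keySet();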
    

    This code first creates a map of counts. For this, it first streams the values of the outer map and then uses Stream.flatMap to create a new stream with the values of all the inner maps. As these values are actually lists, we need to use Stream.flatMap again, to finally get a stream of Integer. (I've kept the variable names from your question).

    We collect to a map of counts, where the keys are the numbers from all the inner maps' list values, and the values are the counts for each one of these numbers, across all maps and lists.

    Then, we remove all entries from the counts map that have a value of 1. The remaining keys are the duplicate numbers.
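
To try the stream version end to end, here is a minimal, self-contained sketch that runs it against the sample data from the question. The class name StreamDuplicatesSketch and the method findDuplicates are made up for this illustration. Passing HashMap::new to groupingBy is my own addition, so that the counts map is explicitly mutable before removeIf is called on its values.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical wrapper class, only here to make the snippet runnable (Java 8+).
class StreamDuplicatesSketch {

    // Returns every number that occurs more than once across all inner lists.
    static Set<Integer> findDuplicates(Map<String, LinkedHashMap<String, List<Integer>>> streams) {
        Map<Integer, Long> counts = streams.values().stream()
                .flatMap(bDescList -> bDescList.values().stream())
                .flatMap(List::stream)
                // HashMap::new is an assumption of this sketch: it makes the result mutable.
                .collect(Collectors.groupingBy(
                        Function.identity(), HashMap::new, Collectors.counting()));

        // Drop the numbers that were seen only once.
        counts.values().removeIf(c -> c == 1L);

        // Copy into a sorted set so the output order is predictable.
        return new TreeSet<>(counts.keySet());
    }

    public static void main(String[] args) {
        // Sample data from the question.
        LinkedHashMap<String, List<Integer>> inner = new LinkedHashMap<>();
        inner.put("bDesc_1000", Arrays.asList(62, 72, 82, 92, 102, 112, 122));
        inner.put("bDesc_1001", Arrays.asList(180, 190, 200, 210, 220, 230, 240));
        inner.put("cMessage_1000", Arrays.asList(112));
        inner.put("cMessage_1001", Arrays.asList(232));

        Map<String, LinkedHashMap<String, List<Integer>>> streams = new LinkedHashMap<>();
        streams.put("break_desc100", inner);

        System.out.println(findDuplicates(streams)); // prints [112]
    }
}
```

It does not matter that bDesc_1000 and cMessage_1000 are not adjacent: every number ends up in the same map of counts regardless of which list it came from.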


    EDIT: Here's the equivalent code in Java 5... 😨😨😨

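    import java.util.HashMap;
    import java.util.Iterator;
    import java.util.List;
    import java.util.Map;
    import java.util.Set;

    // Count how often each number occurs across all inner lists
    Map<Integer, Long> counts = new HashMap<Integer, Long>();
    
    for (Map<String, List<Integer>> bDescList : streams.values()) {
        for (List<Integer> bDesc : bDescList.values()) {
            for (Integer n : bDesc) {
                Long c = counts.get(n);
                if (c == null) {
                    c = 0L; // first time this number is seen
                }
                counts.put(n, c + 1);
            }
        }
    }
    
    // Drop every number that occurs only once; removing through the
    // iterator of the values view also removes the entry from the map
    Iterator<Long> it = counts.values().iterator();
    while (it.hasNext()) {
        Long c = it.next();
        if (c == 1L) {
            it.remove();
        }
    }
    
    // The keys that remain are the duplicates
    Set<Integer> duplicates = counts.keySet();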
    

    The rationale is exactly the same here... We create a map of counts by iterating the map of maps of lists, then we remove entries with a count of 1 and the remaining keys are the duplicates.
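
One thing to keep in mind: counting occurrences also reports a number that is repeated inside a single list. If only repeats across different lists should count, a variant along the following lines might fit better. This is a different technique from the counting approach above, not a restatement of it: it relies on Set.add returning false for a value that is already present, and it collapses each list into a HashSet first. The class and method names are hypothetical, and the code sticks to Java 5 syntax (no diamond operator, no lambdas).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical wrapper class, only here to make the snippet runnable.
class CrossListDuplicatesSketch {

    // Returns the numbers that appear in at least two different lists.
    static Set<Integer> findCrossListDuplicates(
            Map<String, LinkedHashMap<String, List<Integer>>> streams) {
        Set<Integer> seen = new HashSet<Integer>();
        Set<Integer> duplicates = new TreeSet<Integer>(); // sorted for predictable output

        for (Map<String, List<Integer>> bDescList : streams.values()) {
            for (List<Integer> bDesc : bDescList.values()) {
                // Collapse repeats inside this one list so they are not reported.
                for (Integer n : new HashSet<Integer>(bDesc)) {
                    // Set.add returns false when the value was already present.
                    if (!seen.add(n)) {
                        duplicates.add(n);
                    }
                }
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        // Sample data from the question.
        LinkedHashMap<String, List<Integer>> inner =
                new LinkedHashMap<String, List<Integer>>();
        inner.put("bDesc_1000", Arrays.asList(62, 72, 82, 92, 102, 112, 122));
        inner.put("bDesc_1001", Arrays.asList(180, 190, 200, 210, 220, 230, 240));
        inner.put("cMessage_1000", Arrays.asList(112));
        inner.put("cMessage_1001", Arrays.asList(232));

        Map<String, LinkedHashMap<String, List<Integer>>> streams =
                new LinkedHashMap<String, LinkedHashMap<String, List<Integer>>>();
        streams.put("break_desc100", inner);

        System.out.println(findCrossListDuplicates(streams)); // prints [112]
    }
}
```

For the sample data both approaches give 112. They only differ when a single list contains the same number more than once.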