Search code examples
groovyjenkins-groovy

Efficient way to find duplicate items among lists in Groovy


I have groovy map built in following manner :

list1 = [ "val1" "val2" "val3" ]
list2 = [ "val7" "val8" ]
list3 = [ "val4" "val5" "val2" ]
list4 = [ "val6" "val4" "val3" ]
map["key1"] = list1
map["key2"] = list2
map["key3"] = list3
map["key4"] = list4

I need to iterate through map and compare each list with another lists in the map and if any of the item from a list matches with item with any of other list then throw an error for e.g. in above case,as list1's value ( val2 ) matches with list3, so it should throw an error and stop processing further in the map.

I can intersect 2 lists to find the duplicates but for this I may need to iterate over the map twice, for e.g. for each key , fetch the values and then iterate again the map to fetch lists for other keys and intersect one by one from first list to find the duplicates OR something similar along this direction but that would not be an efficient way to achieve this.

Can this be achieved in more efficient way in groovy?


Solution

  • How about something like the below using disjoint

    def list1 = [ "val1", "val2" ,"val3" ]
    def list2 = [ "val7", "val8" ]
    def list3 = [ "val4", "val5", "val24" ]
    def list4 = [ "val6", "val4", "val3" ]
    
    def map = [:]
    
    map["key1"] = list1
    map["key2"] = list2
    map["key3"] = list3
    map["key4"] = list4
    
    def merged = []
    
    map.each { key, val ->
        println "KEY: $key VAL: $val"
        if(!merged.disjoint(val)) {
             throw new Exception("Duplicates!!!")
        }
        merged.addAll(val) 
    }
    

    Here the time complexity of disjoint is O(n). Also, it's worth noting that between two lists a and b if a.size() < b.size() the time complexity will be O(a).