python dictionary sublist

Matching values in variable-length lists containing sublists in Python


I am trying to iterate through a dictionary where each key maps to a list, which in turn contains anywhere from 0 to 20+ sublists. The goal is to iterate through the values of dictionary 1 and check whether they appear in any of the sublists of dictionary 2 under the same key. If they do, I want to add 1 to a counter and not consider that sublist again.

The code looks somewhat like this:

dict1={"key1":[[1,2],[6,7]],"key2":[[1,2,3,4,5,6,7,8,9]]}
dict2={"key1":[[0,1,2,3],[5,6,7,8],[11,13,15]],"key2":[[7,8,9,10,11],[16,17,18]]}

for (k,v), (k2,v2) in zip(dict1.items(),dict2.items()):
    temp_hold=[]
    span_overlap=0
    for x in v:
        if x in v2 and v2 not in temp_hold:
            span_overlap+=1
            temp_hold.append(v2)
        else:
            continue
    print(temp_hold, span_overlap)

This obviously does not work, mainly because the code cannot check hierarchically through the list and its sublists, and partly because my iteration syntax is probably incorrect. I do not have a great grasp of nested loops and iteration, which makes this a pain. Another option would be to first join the sublists into a single list using:

v=[y for x in v for y in x]

This would make it easy to check whether a value is in the other dictionary, but I would lose the ability to work with the specific sublist that contained the matched values, and I could no longer count that sublist only once.
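To make the trade-off concrete, here is a small sketch using the key1 data from above. The names `flat` and `hits` are only illustrative and do not appear in the original code.

```python
# Sublists for "key1", copied from the example dictionaries above.
v = [[1, 2], [6, 7]]
v2 = [[0, 1, 2, 3], [5, 6, 7, 8], [11, 13, 15]]

# Flattening v2 makes membership tests trivial...
flat = [y for x in v2 for y in x]
hits = [value for sub in v for value in sub if value in flat]
print(hits)  # [1, 2, 6, 7]

# ...but the flat list no longer records which sublist each value came from,
# so a matched sublist cannot be identified or excluded afterwards.
print(flat)  # [0, 1, 2, 3, 5, 6, 7, 8, 11, 13, 15]
```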

The desired output is a count of 2 for key1, and 1 for key2, as well as being able to handle the matching sublists for further analysis.


Solution

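  • Here is one solution. It first converts each list of lists into a list of sets, so that overlap between two sublists can be tested with set.isdisjoint. If you have any control over how the lists are built, store the sublists as sets in the first place.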

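    def matching_sublists(dict1, dict2):
        """Count, per key, the overlapping (dict1 sublist, dict2 sublist) pairs."""
        result = {}
        for k in dict1:
            assert k in dict2, "key missing from dict2: %r" % (k,)
            # Sets make the overlap test cheap.
            A = [set(sub) for sub in dict1[k]]
            B = [set(sub) for sub in dict2[k]]
            # Add 1 for every pair of sublists sharing at least one value.
            result[k] = sum(1 for sublistA in A for sublistB in B
                            if not sublistA.isdisjoint(sublistB))
        return result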
    
    
    if __name__=='__main__':
        dict1={"key1":[[1,2],[6,7]],"key2":[[1,2,3,4,5,6,7,8,9]]}
        dict2={"key1":[[0,1,2,3],[5,6,7,8],[11,13,15]],"key2":[[7,8,9,10,11],[16,17,18]]}
        print(matching_sublists(dict1, dict2))
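
The function above counts every overlapping pair, so a dict2 sublist that is touched by two different dict1 sublists would be counted twice. Since the question asks for each sublist to be counted only once and for the matching sublists to remain available, a variant along the following lines may be closer to the goal. This is a minimal sketch: the name `matching_sublists_once`, the `(count, matched)` return shape, and the handling of keys missing from dict2 are assumptions for illustration, not part of the original answer.

```python
def matching_sublists_once(dict1, dict2):
    """Return {key: (count, matched)}; each dict2 sublist counts at most once."""
    result = {}
    for key, sublists1 in dict1.items():
        # Pool the values from every dict1 sublist under this key into one set.
        values1 = set().union(*sublists1)
        # Keep each dict2 sublist that shares at least one value with dict1.
        # Assumption: a key missing from dict2 is treated as having no sublists.
        matched = [sub for sub in dict2.get(key, [])
                   if not values1.isdisjoint(sub)]
        result[key] = (len(matched), matched)
    return result


dict1 = {"key1": [[1, 2], [6, 7]], "key2": [[1, 2, 3, 4, 5, 6, 7, 8, 9]]}
dict2 = {"key1": [[0, 1, 2, 3], [5, 6, 7, 8], [11, 13, 15]],
         "key2": [[7, 8, 9, 10, 11], [16, 17, 18]]}

for key, (count, matched) in matching_sublists_once(dict1, dict2).items():
    print(key, count, matched)
```

For the example data this gives a count of 2 for key1 and 1 for key2, and the matched sublists are returned as the original lists so they can be used for further analysis.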