Tags: python, dictionary, defaultdict

Remove Python dictionary entries for keys whose values are a subset of another key's values


I have a dictionary generated using defaultdict:

{"GGGAAATTTCCCTTTGGGAAACGG": ["9/1", "9/2", "1/1.1", "9/2.1"],
 "GGGAAATTTCCCTTTGGGAAAGCC": ["9/2", "9/2.1"],
 "GGGAAATTTCCCTTTGGGAAAGGG": ["1/1", "1/2", "9/1", "1/1.1"]}

One of the entries is a subset of another in terms of its values:

"GGGAAATTTCCCTTTGGGAAAGCC": ["9/2", "9/2.1"]

is a subset of

"GGGAAATTTCCCTTTGGGAAACGG": ["9/1", "9/2", "1/1.1", "9/2.1"]

How would I go about collapsing the dictionary so that in the end I would get either of these results?

{"GGGAAATTTCCCTTTGGGAAACGG": ["9/1", "9/2", "1/1.1", "9/2.1"],
 "GGGAAATTTCCCTTTGGGAAAGGG": ["1/1", "1/2", "9/1", "1/1.1"]}

or

{["GGGAAATTTCCCTTTGGGAAACGG", "GGGAAATTTCCCTTTGGGAAAGCC"]:
    ["9/1", "9/2", "1/1.1", "9/2.1"],
 "GGGAAATTTCCCTTTGGGAAAGGG":
    ["1/1", "1/2", "9/1", "1/1.1"]}

Edit:

So as requested this was my attempt:

#dd is my defaultdict
for keys, values in dd.iteritems():
        if all(for item in values:
                if item in dd.items():
                    return True
                else:
                    return False):
             print keys

Solution

  • You can try this

    mydict = {"GGGAAATTTCCCTTTGGGAAACGG": ["9/1", "9/2", "1/1.1", "9/2.1"],
     "GGGAAATTTCCCTTTGGGAAAGCC": ["9/2", "9/2.1"],
     "GGGAAATTTCCCTTTGGGAAAGGG": ["1/1", "1/2", "9/1", "1/1.1"]}
    
    >>> dict([i for i in mydict.items() if not any(set(j).issuperset(set(i[1])) and j != i[1] for j in mydict.values())])
    
    {'GGGAAATTTCCCTTTGGGAAACGG': ['9/1', '9/2', '1/1.1', '9/2.1'],
     'GGGAAATTTCCCTTTGGGAAAGGG': ['1/1', '1/2', '9/1', '1/1.1']}
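
One caveat with the one-liner above: it compares the value lists with `!=`, so two keys whose lists hold the same items in a different order would each count as a superset of the other and both be dropped. A minimal sketch of a stricter variant, assuming Python 3 and a hypothetical helper name `drop_subsets`, uses the proper-subset operator `<` on sets instead:

```python
def drop_subsets(mapping):
    """Return a new dict without entries whose values are a proper subset
    of another entry's values. `drop_subsets` is a hypothetical helper name."""
    # Convert each value list to a set once, instead of on every comparison.
    as_sets = {key: set(values) for key, values in mapping.items()}
    return {
        key: values
        for key, values in mapping.items()
        # `a < b` is True only when a is a subset of b and a != b.
        if not any(as_sets[key] < other for other in as_sets.values())
    }


mydict = {"GGGAAATTTCCCTTTGGGAAACGG": ["9/1", "9/2", "1/1.1", "9/2.1"],
          "GGGAAATTTCCCTTTGGGAAAGCC": ["9/2", "9/2.1"],
          "GGGAAATTTCCCTTTGGGAAAGGG": ["1/1", "1/2", "9/1", "1/1.1"]}

result = drop_subsets(mydict)
print(sorted(result))
# ['GGGAAATTTCCCTTTGGGAAACGG', 'GGGAAATTTCCCTTTGGGAAAGGG']
```

Because it builds a new dictionary rather than mutating the input, the original `mydict` is left untouched.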
    

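    Or loop over a copy of the items and pop each subset entry in place:

    # Iterate over a copy: popping while iterating the live dict fails on Python 3.
    for key, values in list(mydict.items()):
        for other in mydict.values():
            if values != other and set(other).issuperset(values):
                mydict.pop(key)
                break  # key is removed, so stop scanning the other values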
    
    >>> mydict
    {'GGGAAATTTCCCTTTGGGAAACGG': ['9/1', '9/2', '1/1.1', '9/2.1'],
     'GGGAAATTTCCCTTTGGGAAAGGG': ['1/1', '1/2', '9/1', '1/1.1']}
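
The second result in the question uses a list as a dictionary key, which Python rejects because lists are unhashable. A rough sketch of that variant, assuming Python 3 and tuple keys instead (the name `merge_subsets` is hypothetical, and every key becomes a tuple, even when nothing was merged into it), folds each subset key into an entry whose values contain it:

```python
def merge_subsets(mapping):
    """Group each key whose values are a proper subset of another entry's
    values under that entry, using tuple keys. Hypothetical helper name."""
    as_sets = {key: set(values) for key, values in mapping.items()}
    # Maximal entries: not a proper subset of any other entry.
    maximal = [
        key for key in mapping
        if not any(as_sets[key] < other for other in as_sets.values())
    ]
    groups = {key: [key] for key in maximal}
    for key in mapping:
        if key in groups:
            continue
        # Attach the subset key to the first maximal entry that contains it.
        for parent in maximal:
            if as_sets[key] < as_sets[parent]:
                groups[parent].append(key)
                break
    return {tuple(keys): mapping[parent] for parent, keys in groups.items()}


data = {"GGGAAATTTCCCTTTGGGAAACGG": ["9/1", "9/2", "1/1.1", "9/2.1"],
        "GGGAAATTTCCCTTTGGGAAAGCC": ["9/2", "9/2.1"],
        "GGGAAATTTCCCTTTGGGAAAGGG": ["1/1", "1/2", "9/1", "1/1.1"]}

merged = merge_subsets(data)
for keys, values in merged.items():
    print(keys, values)
```

If a subset fits under more than one maximal entry, this sketch assigns it to the first one found; attaching it to all of them would be an equally reasonable choice.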