Tags: python, python-3.x, dictionary, iteritems

Keep first occurrence of duplicated values in a dictionary


I have a dictionary with duplicated values, such as the following:

dict_numbers={'one':['one', 'first','uno', 'une'],
            'zero':['zero','nothing','cero'],
            'first':['one', 'first','uno', 'une'],
            'dos':['two','second','dos', 'segundo','deux'],
            'three':['three','tres','third','tercero'],
            'second':['two','second','dos','segundo','deux'],
            'forth':['four','forth', 'cuatro','cuarto'],
            'two': ['two','second','dos', 'segundo','deux'],
            'segundo':['two','second','dos', 'segundo','deux']}

I'd like to get the first occurrence of each key among keys that share the same values. Note that the dictionary has no duplicated keys, only duplicated values. For this example, I would want a list that keeps only the first key for each set of duplicated values:

list_numbers_no_duplicates=['one','zero','dos','three','forth']

The key first is removed because one already has the same values. The keys second, two and segundo are removed because dos already has the same values.

How can I keep track of values that are duplicated across several keys?

Thanks in advance


Solution

  • Hopefully I understood your goal correctly. The following uses chain from the ever-so-useful itertools module.

        >>> from itertools import chain
        >>> {key: vals for i, (key, vals) in enumerate(dict_numbers.items())
        ...  if key not in chain(*list(dict_numbers.values())[:i])}
        {'one': ['one', 'first', 'uno', 'une'], 
        'zero': ['zero', 'nothing', 'cero'], 
        'dos': ['two', 'second', 'dos', 'segundo', 'deux'], 
        'three': ['three', 'tres', 'third', 'tercero'], 
        'forth': ['four', 'forth', 'cuatro', 'cuarto']}
    

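    Essentially, this rebuilds the original dictionary, keeping only the entries whose key does not appear in any of the preceding value lists (hence the enumerate and slicing shenanigans). Note that it relies on every duplicated key showing up inside the value list of an earlier entry, which holds for your example. To get the list of keys you asked for, wrap the result in list().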
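If the keys do not necessarily appear inside the value lists, a different approach may be safer: remember which value lists have already been seen and keep a key only the first time its list shows up. The following is a minimal sketch of that idea, not the method used above. The function name first_keys_with_unique_values is made up for illustration, and it assumes that two value lists count as duplicates only when they are exactly equal, including element order.

```python
def first_keys_with_unique_values(d):
    """Return the keys whose value list was not seen under an earlier key."""
    seen = set()   # value lists encountered so far, stored as tuples
    kept = []
    # Dictionaries preserve insertion order in Python 3.7+,
    # so "first occurrence" is well defined.
    for key, vals in d.items():
        signature = tuple(vals)   # lists are unhashable, tuples are hashable
        if signature not in seen:
            seen.add(signature)
            kept.append(key)
    return kept


dict_numbers = {'one': ['one', 'first', 'uno', 'une'],
                'zero': ['zero', 'nothing', 'cero'],
                'first': ['one', 'first', 'uno', 'une'],
                'dos': ['two', 'second', 'dos', 'segundo', 'deux'],
                'three': ['three', 'tres', 'third', 'tercero'],
                'second': ['two', 'second', 'dos', 'segundo', 'deux'],
                'forth': ['four', 'forth', 'cuatro', 'cuarto'],
                'two': ['two', 'second', 'dos', 'segundo', 'deux'],
                'segundo': ['two', 'second', 'dos', 'segundo', 'deux']}

list_numbers_no_duplicates = first_keys_with_unique_values(dict_numbers)
print(list_numbers_no_duplicates)
# ['one', 'zero', 'dos', 'three', 'forth']
```

This makes a single pass over the dictionary instead of re-scanning all preceding lists for every key. If the order of elements inside a list should not matter, frozenset(vals) could be used as the signature instead of tuple(vals), under the assumption that repeated elements within one list are not significant.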