Removing Duplicate Values From List (Python)


I'm trying to make an inverted index for some NLP work, to see how many times a word appears in a document. I'm doing this with a dictionary, but my output looks like this (here the word man appears in documents 1 and 11):

{'man': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
 1, 1, 1, 1, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11],
 'upon': [1, 1, 1, 3, 3, 3, 1539, 1539, 1539]}

How do I get rid of these duplicate values so that I just have:

{'man': [1,11], 'upon': [1,3,1539]}

Solution

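  • Just convert each list of values to a set (which drops the duplicates) and then back to a list. Sets are unordered, so the order of the resulting lists is not guaranteed: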

    my_dict = {k: list(set(v)) for k, v in my_dict.items()}
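
If the document IDs should come back in ascending order, as in the desired output, one possible variation is to sort the set; if first-seen order matters instead, `dict.fromkeys` can do the deduplication. The following is a minimal sketch, not part of the original answer: the helper names `dedupe_sorted` and `dedupe_keep_order` are made up for illustration, and the sample data is shortened from the question.

```python
# Shortened sample data in the shape shown in the question.
my_dict = {
    "man": [1, 1, 1, 11, 11],
    "upon": [1, 1, 3, 3, 1539, 1539],
}


def dedupe_sorted(index):
    """Drop duplicate document IDs and return them in ascending order."""
    return {word: sorted(set(doc_ids)) for word, doc_ids in index.items()}


def dedupe_keep_order(index):
    """Drop duplicate document IDs, keeping the order of first appearance."""
    # dict keys are unique and keep insertion order (Python 3.7+).
    return {word: list(dict.fromkeys(doc_ids)) for word, doc_ids in index.items()}


print(dedupe_sorted(my_dict))
# {'man': [1, 11], 'upon': [1, 3, 1539]}
```

Both helpers return a new dictionary and leave the original untouched. They assume the IDs are hashable (integers here), and `dedupe_sorted` additionally assumes they can be compared with each other.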