
How do I find all unique words (no duplicates)?


I would like to find all the unique words that appear in both files. I can list the words from each file, but the output contains duplicates. I would also like to sort the words alphabetically. How do I go about doing this?

#!/usr/bin/python3

# First file
file = input("Please enter the name of the first file: ")

store = open(file)

new = store.read()

# Second file
file2 = input("Please enter the name of the second file: ")

store2 = open(file2)

new2 = store2.read()

for line in new.split():
    if line in new2:
        print(line)

Solution

  • Here is a snippet which might help you:

    new = 'this is a bunch of words'
    new2 = 'this is another bunch of words'
    
    unique_words = set(new.split())
    unique_words.update(new2.split())
    sorted_unique_words = sorted(unique_words)
    print('\n'.join(sorted_unique_words))
    

    Update:

    If you're only interested in words that are common to both files, do this instead:

    unique_words = set(new.split())
    unique_words2 = set(new2.split())
    common_words = set.intersection(unique_words, unique_words2)
    print('\n'.join(sorted(common_words)))
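
To tie the snippets above back to the two files in the question, here is a minimal sketch that reads each file and returns the sorted words common to both. The helper names `read_words` and `common_words` are made up for illustration; they are not part of the question's code. Words are split on whitespace only, so punctuation stays attached and `Word` and `word` count as different words.

```python
def read_words(path):
    """Return the set of whitespace-separated words in the file at `path`."""
    # The `with` block closes the file once it has been read.
    with open(path) as handle:
        return set(handle.read().split())


def common_words(path_a, path_b):
    """Return the words present in both files, sorted alphabetically."""
    # `&` is the set intersection operator; building sets drops duplicates.
    return sorted(read_words(path_a) & read_words(path_b))
```

With the file names from the question, `print('\n'.join(common_words(file, file2)))` would print one word per line. Note that `sorted` places uppercase letters before lowercase ones by default; passing `key=str.lower` to `sorted` is one way to get a case-insensitive order if that matters.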