python string list python-re

How to remove duplicate words from a list


I have a list that can contain words like these:

list = ["word","textword","randomword"]

The repeated word can appear inside another word, and the word itself can sit anywhere in the list: at the start, in the middle, or at the end. This is what I tried:

import re
list = ["word","textword","randomword"]
lst = []
for i in list:
    control_word = str(i)
    list.remove(control_word)
    for text in list:
        result = re.sub(fr"[^{control_word}]", '', text)
        lst.append(result)
print(lst)

output: ['word', 'rdoword', 'word']

Desired output: ["word","text","random"]

I thought I could check for the repeated word with the in operator, then turn that word into a regular expression pattern and remove it from the string, but so far this does not work.


Solution

  • Here is a simple double loop that works with the data you gave. Because every entry is compared against every other entry, it also works when the control word is not at index 0.

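    # Named "words" rather than "list" to avoid shadowing the built-in list type.
    words = ["word", "textword", "randomword"]

    for i in range(len(words)):
        for k in range(len(words)):
            # Strip words[i] from every other entry that contains it.
            if words[i] in words[k] and words[i] != words[k]:
                words[k] = words[k].replace(words[i], "")

    print(words)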
    

    Output: ['word', 'text', 'random']
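
  • If you would rather keep the regular expression idea from the question, the sketch below may help. In the original attempt the pattern [^word] is a character class: it matches any single character that is not w, o, r or d, so it deletes individual letters instead of the whole word. Passing the word through re.escape and using it as the pattern matches the word literally. The function name strip_repeated_words is made up for this example, and the sketch assumes every entry that contains another entry should have it removed; the question does not say what should happen when entries overlap in more complicated ways.

```python
import re


def strip_repeated_words(words):
    """Return a new list where each entry has the other entries removed from it.

    Hypothetical helper for illustration; the input list is left unchanged.
    """
    result = []
    for word in words:
        cleaned = word
        for other in words:
            # Skip empty strings and the entry itself; only strip words
            # that still occur inside the current entry.
            if other and other != word and other in cleaned:
                # re.escape makes the word match literally, even if it
                # contains characters such as "." or "[".
                cleaned = re.sub(re.escape(other), "", cleaned)
        result.append(cleaned)
    return result


# Why the pattern in the question fails: [^word] is a character class,
# so every character other than w, o, r, d is removed one by one.
print(re.sub("[^word]", "", "randomword"))  # rdoword

print(strip_repeated_words(["word", "textword", "randomword"]))
# ['word', 'text', 'random']

# The position of the repeated word in the list does not matter.
print(strip_repeated_words(["textword", "randomword", "word"]))
# ['text', 'random', 'word']
```

    Unlike the double loop above, this version builds a new list instead of changing entries while looping over them, so each comparison uses the original words.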