python string contains startswith

How to clean some strings in the list?


I am trying to remove some strings from a list when the string starts with or contains "@", "#", "http" or "rt". A sample list is below.

text_words1 = ['@football', 'haberci', '#sorumlubenim', 'dedigin', 'tarafsiz', 'olurrt', '@football', 'saysaniz', 'olur', '#sorumlubenim', 'korkakligin', 'sonu']

From the list above, I want to remove '@football' and '#sorumlubenim'. I tried the code below.

 i = 0
 while i < len(text_words1):
     if text_words1[i].startswith('@'):
         del text_words1[i] 
     if text_words1[i].startswith('#'):
         del text_words1[i] 
     i = i+1
 print 'The updated list is: \n', text_words1  

However, the code above only removed some strings, not all of the ones which start with "@" or "#" symbols.

Then I added the code below to the code above, because not all strings of interest start with "@", "#" or "http"; some only contain them.

 while i < len(text_words1):
     if text_words1[i].__contains__('@'):
         del text_words1[i] 
     if text_words1[i].__contains__('#'):
         del text_words1[i]
     if text_words1[i].__contains__('http'):
        del text_words1[i]
     i = i+1
 print 'The updated list: \n', text_words1  

The above code removed some of the items which contain "#" or "@", but not all.

Can someone advise me how to remove all items which start with or contain "@", "#", "http", or "rt"?


Solution

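  • As the comments point out, deleting an item while looping by index shifts the remaining items one position to the left. The item that moves into the freed slot is then not checked against every condition before i advances, so some matches survive. You can use a list comprehension to build a new list without the words you don't need (startswith accepts a tuple of prefixes):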

    new_list  = [i for i in text_words1 if not i.startswith(('@','#'))]
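
The comprehension above only handles words that start with "@" or "#". To also drop words that contain "@", "#", "http" or "rt" anywhere, as the question asks, one possible sketch is below. The names unwanted and cleaned are illustrative, not from the original post, and a plain substring test on "rt" will also drop any ordinary word that happens to contain those two letters.

```python
text_words1 = ['@football', 'haberci', '#sorumlubenim', 'dedigin', 'tarafsiz',
               'olurrt', '@football', 'saysaniz', 'olur', '#sorumlubenim',
               'korkakligin', 'sonu']

# Substrings that disqualify a word wherever they appear in it
# (illustrative name; taken from the list given in the question).
unwanted = ('@', '#', 'http', 'rt')

# Keep a word only if none of the unwanted substrings occurs in it.
# A substring test also covers the "starts with" case.
cleaned = [word for word in text_words1
           if not any(token in word for token in unwanted)]

print(cleaned)
```

Because this builds a new list instead of deleting from the one being looped over, no item is skipped. If only whole-word "rt" should be removed, compare with word == 'rt' instead of using the substring test for that token.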