Tags: python, python-3.x, for-loop, if-statement, any

checking if any of multiple substrings is contained in a string - Python


I have a blacklist of banned substrings. I need an if statement that checks whether ANY of the banned substrings is contained in a given URL. If the URL contains none of them, I want to do A. If it contains at least one of them, I want to do B. Either way, the action should run only once per URL, not once for each banned substring.

black_list = ['linkedin.com', 'yellowpages.com', 'facebook.com', 'bizapedia.com', 'manta.com',
              'yelp.com', 'nextdoor.com', 'industrynet.com', 'twitter.com', 'zoominfo.com', 
              'google.com', 'yellow-listings.com', 'kompass.com', 'dnb.com', 'tripadvisor.com']

Here are two simple example URLs that I'm using to check whether it works. url1 contains a banned substring, while url2 doesn't.

url1 = 'https://www.dnb.com/'
url2 = 'https://www.ok/'

I tried the code below, which works, but I was wondering whether there is a better (more computationally efficient) way of doing it. I have a data frame of 100k+ URLs, so I'm worried that this will be very slow.

url = url1  # URL under test; swap in url2 to check the other case

mask = []
for banned in black_list:
    if banned in url:
        mask.append(True)
    else:
        mask.append(False)

if any(mask):
    print("there is a banned substring inside")
else:
    print("no banned substrings inside")

Does anybody know a more efficient way of doing this?


Solution

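  • Here is a possible one-line solution. any() with a generator expression stops at the first banned substring it finds, so no intermediate list is built: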

    print('there is a banned substring inside'
          if any(banned_str in url for banned_str in black_list)
          else 'no banned substrings inside')
    

    If you prefer an explicit if/else statement over the conditional expression:

    if any(banned_str in url for banned_str in black_list):
        print('there is a banned substring inside')
    else:
        print('no banned substrings inside')
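
Since the question mentions 100k+ URLs, one further option is to combine the banned substrings into a single precompiled regular expression and reuse it for every URL. The following is only a sketch under a few assumptions: the URLs are held in a plain list called urls, the blacklist is shortened for brevity, and contains_banned and banned_re are hypothetical names that do not appear in the question. Whether this is actually faster than the any() version depends on the data, so it is worth timing both.

```python
import re

# Banned substrings from the question (shortened here for brevity).
black_list = ['linkedin.com', 'yellowpages.com', 'facebook.com', 'dnb.com']

# Assumption: the URLs are available as a plain list of strings.
urls = ['https://www.dnb.com/', 'https://www.ok/']

# Escape each substring so '.' matches a literal dot, then join with '|'.
# Assumes black_list is non-empty: an empty pattern would match every URL.
banned_re = re.compile('|'.join(re.escape(s) for s in black_list))


def contains_banned(url):
    """Return True if any banned substring occurs anywhere in url."""
    return banned_re.search(url) is not None


# One boolean per URL, computed with a single regex search each.
mask = [contains_banned(u) for u in urls]
print(mask)  # [True, False]
```

If the URLs are in a pandas column, Series.str.contains accepts a regular-expression pattern, so the same escaped pattern string could probably be passed to it to get a boolean mask for the whole column at once.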