python screen-scraping

Check if Python has written the targeted text


I'm writing code to scrape selected portions of visible text from a large number of web pages. Here's part of it:

    divTag = soup.find_all("div", {'id': 'articleBody'})
    for div in divTag:
        # Write the text of every paragraph inside the article body to file f
        pTags = div.find_all("p")
        for p in pTags:
            print(p.text, file=f)

How can I check whether Python has found and written the targeted text, and set the link aside (in a list) if the scraping failed?

I didn't find an answer here, and I don't know where to look in the documentation.


Solution

  • Here is one way to check whether Python found the text you are looking for:

    import requests
    from bs4 import BeautifulSoup

    urls = ['https://www.google.com']
    for url in urls:
        r = requests.get(url)
        soup = BeautifulSoup(r.content, 'lxml')
        # Look for the target string in every <p> tag on the page
        for item in soup.find_all('p'):
            if "2016 - Privacidad - Condiciones" in item.text:
                print("Python has found the targeted text")
    

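    If Python doesn't find the text, you can drop that URL from the list with the list's remove() method (loop over a copy of the list when you do this), or append it to a separate list of failed links.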
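
To tie this back to the selector in the question, here is a minimal sketch of collecting the links where nothing was found. The function names `extract_article_text` and `scrape_pages` are made up for illustration, and the pages are passed in as (url, html) pairs so the sketch runs without network access; in practice the HTML would come from something like `requests.get(url).content`. It uses the built-in `html.parser` instead of `lxml` only to avoid an extra dependency.

```python
from bs4 import BeautifulSoup


def extract_article_text(html):
    """Return the text of every <p> inside <div id="articleBody"> (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    paragraphs = []
    for div in soup.find_all("div", {"id": "articleBody"}):
        for p in div.find_all("p"):
            text = p.get_text(strip=True)
            if text:  # skip empty paragraphs
                paragraphs.append(text)
    return paragraphs


def scrape_pages(pages, out):
    """Write each page's article text to `out` and return the URLs that yielded nothing.

    `pages` is an iterable of (url, html) pairs (an assumption of this sketch).
    """
    failed = []
    for url, html in pages:
        paragraphs = extract_article_text(html)
        if paragraphs:
            for text in paragraphs:
                print(text, file=out)
        else:
            # Nothing matched, so nothing was written: set the link aside
            failed.append(url)
    return failed
```

Checking the result of `find_all` before writing is what tells you whether the scrape succeeded: an empty list means the page had no matching text, and its URL ends up in the returned `failed` list for a later retry or manual check.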