python beautifulsoup html-parsing

How to parse BeautifulSoup results for a specific word to define a boolean?


I am trying to take BeautifulSoup results and check them for a specific word, which will determine whether a certain value is True or False. For example, if I look up a specific element by id and it contains the word "yes", then bool1 = True. If that element contains the word "no", then bool1 = False.

This is what I have so far:

from bs4 import BeautifulSoup, SoupStrainer
import requests

parse_only = SoupStrainer('h1')
page1 = requests.get('http://www.play-hookey.com/htmltest/')
soup = BeautifulSoup(page1.content, 'html.parser', parse_only=parse_only)

results1 = soup.find_all('h1')

print(results1)

I then want to search results1 for a specific word and set the boolean to True or False depending on whether the word is present.


Solution

  • You can check whether the word you want appears in the .text of each tag in results1 (.text is an attribute of each tag, not a method, and results1 itself is a list of tags):

    import requests
    from bs4 import BeautifulSoup
    
    URL = "http://www.play-hookey.com/htmltest/"
    soup = BeautifulSoup(requests.get(URL).content, "html.parser")
    
    results1 = soup.find_all("h1")
    
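    # split() breaks the text on whitespace, so this matches whole words only;
    # the search term must therefore be a single word
    word = "yes"
    
    # True if at least one <h1> contains the word, otherwise False
    bool1 = any(word in tag.text.split() for tag in results1)
    print(bool1)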
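The question also mentions looking up one element by id and mapping "yes" to True and "no" to False. Below is a minimal sketch of that idea, not a definitive implementation. The HTML snippet, the ids answer1 to answer3, and the helper name id_to_bool are all made up for illustration, and the inline string stands in for a page fetched with requests. The helper returns None when the element is missing or contains neither word, which is one possible design choice among several.

```python
import re
from typing import Optional

from bs4 import BeautifulSoup

# Hypothetical markup standing in for a fetched page (ids are invented)
HTML = """
<h1 id="answer1">Yes, it works</h1>
<h1 id="answer2">No luck today</h1>
<h1 id="answer3">Nothing to see</h1>
"""


def id_to_bool(soup: BeautifulSoup, element_id: str) -> Optional[bool]:
    """Return True for "yes", False for "no", None if neither applies."""
    tag = soup.find(id=element_id)
    if tag is None:
        return None  # no element with that id

    # Lowercase the text and pull out whole words, so "Yes," matches "yes"
    # but "Nothing" does not match "no"
    words = set(re.findall(r"[a-z']+", tag.get_text().lower()))

    if "yes" in words:
        return True
    if "no" in words:
        return False
    return None  # element found, but it contains neither word


soup = BeautifulSoup(HTML, "html.parser")
bool1 = id_to_bool(soup, "answer1")
bool2 = id_to_bool(soup, "answer2")
bool3 = id_to_bool(soup, "answer3")
print(bool1, bool2, bool3)
```

If an element contains both words, this sketch lets "yes" win because it is checked first; reorder the checks if the opposite behaviour is wanted.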