python html beautifulsoup screen-scraping

Strip HTML Tags: find_all + Beautiful Soup


I have spent about two hours searching and I believe my brain is just fried. Today is my first day with BeautifulSoup (so please be gentle). The source code of the website I am scraping has the following format:

<a href="/listing/view" class="price">$100</a>

I feel pretty dumb because I am getting the whole <a> tags when writing to a file, and I have a sneaking suspicion that there is a simple solution, but I cannot seem to find it.

Currently I'm using the following:

soup = BeautifulSoup(page.content, 'html.parser')
prices = soup.find_all(class_="price")
passed.append(prices)

How can I get just the text content of the tags that match a given class, without the surrounding markup?


Solution

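  • find_all returns Tag objects, so appending its result stores the whole <a> elements. Loop over the results and take the text of each tag instead:

    prices = soup.find_all(class_="price")

    for a in prices:
        # a.text is the tag's inner text, e.g. "$100"; drop whitespace and the "$", then convert
        passed.append(int(a.text.strip().replace('$', '')))  # will append to the list

    This should help. Note that int() will raise a ValueError if a price contains a thousands separator or a decimal part.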
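    For anyone who wants to try this without hitting the real site, here is a minimal self-contained sketch of the same idea. The sample markup is made up (modeled on the snippet in the question), and the comma handling is my own assumption about how larger prices might be formatted, so adjust it to whatever the page actually contains. It uses get_text(strip=True), which returns the inner text with surrounding whitespace removed.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, modeled on the snippet in the question.
html = """
<a href="/listing/view" class="price">$100</a>
<a href="/listing/view" class="price"> $2,500 </a>
<a href="/listing/view" class="title">Not a price</a>
"""

soup = BeautifulSoup(html, "html.parser")

passed = []
for a in soup.find_all("a", class_="price"):
    # get_text(strip=True) gives only the inner text, e.g. "$100"
    text = a.get_text(strip=True)
    # Assumption: prices are whole dollars, possibly with thousands separators
    passed.append(int(text.replace("$", "").replace(",", "")))

print(passed)  # [100, 2500]
```

    If the prices can have cents, converting with decimal.Decimal instead of int would probably be the safer choice.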