python web-scraping beautifulsoup findall

Problem with find_all in BeautifulSoup4


I want to get information from the following website. I need book titles, codes, prices, etc. For instance, let's concentrate on ISBN codes. I want to find any piece of text in the HTML that contains the word "ISBN".

My code is the following:

    import requests
    from bs4 import BeautifulSoup

    url = 'https://www.boekenprijs.be/uitgebreid-zoeken?zoek=&veld=all&gbpstartdatumvan=&gbpstartdatumtotenmet=&gbpeinddatumvan=01/04/2024&gbpeinddatumtotenmet=12/08/2024&_token=FAoSCCoUK-SPrL-ktj4MtsVBv3L4K-FaH3jxSo259D0&page=1'
    result = requests.get(url)
    doc = BeautifulSoup(result.text, "html.parser")
    aux = doc.find_all(string="ISBN")

My problem is that aux comes back empty: find_all finds nothing for "ISBN", even though I can see the word when I look at the HTML.


Solution

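  • This may not be the best way, but here is an alternative: I'm using a lambda to keep the div tags whose text contains "ISBN". Passing string="ISBN" returns nothing because it only matches text nodes that are exactly "ISBN", not ones that merely contain it.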

    import requests
    from bs4 import BeautifulSoup
    
    url = 'https://www.boekenprijs.be/uitgebreid-zoeken?zoek=&veld=all&gbpstartdatumvan=&gbpstartdatumtotenmet=&gbpeinddatumvan=01/04/2024&gbpeinddatumtotenmet=12/08/2024&_token=FAoSCCoUK-SPrL-ktj4MtsVBv3L4K-FaH3jxSo259D0&page=1'
    result = requests.get(url)
    doc = BeautifulSoup(result.text, "html.parser")
    # Find every div whose text contains "ISBN" (outer divs that wrap such a div match too)
    aux = doc.find_all(lambda tag: tag.name == 'div' and "ISBN" in tag.text)
    for element in aux:
        print(element.text)
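
Another option, if only the matching text is needed rather than the surrounding div, might be to pass a compiled regular expression as the string argument, since a pattern matches any text node that contains it. The sketch below runs on a small hypothetical HTML snippet instead of the live page, so the markup, class names, and ISBN values are assumptions for illustration only; find_isbn_strings is a helper name made up here.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup for illustration; the real page's structure may differ.
SAMPLE_HTML = """
<div class="book">
  <h2>Example title</h2>
  <div class="meta">ISBN: 9780000000001</div>
  <div class="meta">Price: 12.50</div>
</div>
<div class="book">
  <h2>Another title</h2>
  <div class="meta">ISBN: 9780000000002</div>
</div>
"""


def find_isbn_strings(html):
    """Return every text node that contains the substring 'ISBN'."""
    soup = BeautifulSoup(html, "html.parser")
    # A compiled pattern is applied as a search, so partial matches count.
    return [s.strip() for s in soup.find_all(string=re.compile("ISBN"))]


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
exact = soup.find_all(string="ISBN")  # matches only text nodes equal to "ISBN"
partial = find_isbn_strings(SAMPLE_HTML)

print(exact)    # []
print(partial)  # ['ISBN: 9780000000001', 'ISBN: 9780000000002']
```

Compared with the lambda on div tags, this returns each matching string once, because it selects text nodes rather than every enclosing div.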