
How to find all Elements of a specific Type with the new Requests-HTML library


I want to find all elements of a specific type in an HTML document. With Beautiful Soup, this code works:

from bs4 import BeautifulSoup

soup = BeautifulSoup(html_text, 'html.parser')
urls_previous = soup.find_all('h2', {'class': 'b_algo'})

How can I do the same search with the Requests-HTML library? Or can it only find a single element in an HTML document? I couldn't find this in the docs or examples:

https://html.python-requests.org/

Example:

<li class="b_algo"><h2><a href="https://de.wikipedia.org/wiki/Vereinigte_Staaten">Vereinigte Staaten – Wikipedia</a></h2><a href="https://de.wikipedia.org/wiki/Vereinigte_Staaten">https://de.wikipedia.org/wiki/Vereinigte_Staaten</a></div><p>U.S., I wanna have THIS text here</p></li>

How can I find all elements of a specific type with the Requests-HTML library?


Solution

  • with requests-html

    from requests_html import HTML
    doc = """<li class="b_algo"><h2><a href="https://de.wikipedia.org/wiki/Vereinigte_Staaten">Vereinigte Staaten – Wikipedia</a></h2><a href="https://de.wikipedia.org/wiki/Vereinigte_Staaten">https://de.wikipedia.org/wiki/Vereinigte_Staaten</a></div><p>U.S., I wanna have THIS text here</p></li>"""
    # Load the HTML from a string
    html = HTML(html=doc)
    # find() takes a CSS selector and returns a list of all matching Element objects
    x = html.find('h2')
    print(x)