Get specific <h1> text from HTML in Python


I want to get only the title of the page, i.e. the text inside <h1>This is Title</h1>, in Python.

I tried a few approaches but couldn't get the desired result.

import requests
from bs4 import BeautifulSoup

response = requests.get("https://www.strawpoll.me/20321563/r")
html_content = response.content
soup = BeautifulSoup(html_content, "html.parser")

for i in soup.get_text("p", {"class": "result-list"}):
    print(i)

Solution

  • Use lxml for tasks like this. You could use BeautifulSoup as well.

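    import lxml.html
    import requests
    # url is the page from the question; lxml's own URL loading does not handle https
    url = "https://www.strawpoll.me/20321563/r"
    t = lxml.html.fromstring(requests.get(url).content)
    print(t.find(".//title").text)  # use ".//h1" for the first <h1> instead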
    

    (This is from "How can I retrieve the page title of a webpage using Python?" by Peter Hoffmann)
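
Since the question asks for the <h1> text rather than the <title>, here is a minimal BeautifulSoup sketch of that lookup. The attempt in the question likely fails because the first positional argument of get_text() is a separator string, not a tag name. The call therefore returns the whole page text as one string, and the loop prints it one character at a time. The helper name first_h1_text and the sample HTML are made up for illustration; only the <h1> content comes from the question.

```python
from bs4 import BeautifulSoup


def first_h1_text(html):
    """Return the stripped text of the first <h1> in html, or None if there is none."""
    soup = BeautifulSoup(html, "html.parser")
    h1 = soup.find("h1")  # first matching tag, or None when the page has no <h1>
    return h1.get_text(strip=True) if h1 is not None else None


# Illustrative sample page (an assumption, not the real strawpoll markup).
sample = "<html><head><title>Poll</title></head><body><h1>This is Title</h1></body></html>"
print(first_h1_text(sample))  # prints: This is Title
```

With the code from the question, passing response.content to first_h1_text should give the same kind of result, assuming the page actually contains an <h1> in the HTML it serves.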