python html beautifulsoup python-requests

Printing a webpage's raw HTML data unformatted, with tags and similar information


import requests
from bs4 import BeautifulSoup


url = 'https://www.somewebpage.com'
response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')
print(soup.prettify())

When I run this, the terminal in Codewars displays the output as the rendered webpage for the URL, hyperlinks and all.

When I print the data, I want to see all of it unformatted, with the tags, classes, and other information visible, similar to how it appears when using Inspect Element in a desktop browser.

For context, I don't know much about HTML, and I'm running this on an iPhone without access to a PC.


Solution

  • Your question is not entirely clear, but I suspect that the Codewars platform renders the HTML directly in its preview of the result.

    One way to display HTML as text is to escape its special characters. There is no need to use BeautifulSoup to extract the HTML; just use the text of the response you get from your request:

    import html
    import requests
    
    url = 'https://docs.python.org/3/library/html.html#html.escape'
    print(html.escape(requests.get(url).text))
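To see what the escaping does without making a network request, here is a minimal sketch using a small hard-coded string. The `sample` markup is a made-up stand-in for `requests.get(url).text`, not something from the question. `html.escape` replaces `&`, `<`, `>` and (by default) quote characters with entities, so a preview that renders HTML should show the tags literally instead of interpreting them.

```python
import html

# Hypothetical sample markup standing in for requests.get(url).text
sample = '<a class="link" href="https://example.com">Home & About</a>'

# Replace &, <, > and quotes with HTML entities so the markup
# is displayed as text rather than rendered
escaped = html.escape(sample)
print(escaped)
# &lt;a class=&quot;link&quot; href=&quot;https://example.com&quot;&gt;Home &amp; About&lt;/a&gt;

# html.unescape reverses the transformation
print(html.unescape(escaped) == sample)  # True
```

If the output is viewed in a plain terminal that does not render HTML, the escaping is unnecessary, and printing `response.text` directly should already show the raw tags.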