python html plaintext

Convert HTML to plain text in Python


Good morning. I am looking for a way to convert HTML code to plain text. Here is an example:

HTML

<div class="card-headline"><h3 class="card-title">

Plain text

&lt;div class=&quot;card-headline&quot;&gt;&lt;h3 class=&quot;card-title&quot;&gt;

Solution

  • BeautifulSoup is a scraping library, so it's probably not the best choice for doing HTML rendering. If it's not essential to use BeautifulSoup, you should take a look at html2text. For example:

    import html2text

    # Read the HTML file and convert it to Markdown-style plain text
    with open("foobar.html") as f:
        html = f.read()
    print(html2text.html2text(html))

    This outputs:

    Some text more text even more text

    • list item
    • yet another list item

    Some other text

    • list item
    • yet another list item
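
Note that the expected output in the question is the escaped markup itself, not the rendered text. If that is what is needed, a minimal sketch using the standard library's html module may be enough. The variable names markup and escaped are only illustrative:

```python
import html

# The sample markup from the question
markup = '<div class="card-headline"><h3 class="card-title">'

# html.escape replaces &, < and >; with quote=True (the default)
# it also replaces double and single quotes
escaped = html.escape(markup)
print(escaped)
# → &lt;div class=&quot;card-headline&quot;&gt;&lt;h3 class=&quot;card-title&quot;&gt;

# html.unescape converts the entities back to the original characters
restored = html.unescape(escaped)
```

This only escapes the special characters so the tags can be displayed as text; it does not strip tags or render the document the way html2text does.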