Tags: html, python-3.x, beautifulsoup, html-parser, python-beautifultable

How to parse HTML tbody data into Python in tabular format


I am new to Python and am trying to parse this data into a tabular format. I have looked at examples but have been unable to get the desired result.

Can someone please help me with this?

<tbody>
<tr><td>Kupon in %</td><td>36,520</td></tr>
<tr><td>Erstes Kupondatum</td><td>03.07.2017</td></tr>
<tr><td>Letztes Kupondatum</td><td>03.04.2022</td></tr>
<tr><td>Zahlweise Kupon</td><td>Zinszahlung normal</td></tr>
<tr><td>Spezialkupon Typ</td><td>Zinssatz variabel</td></tr>

I need the data in this form:

Kupon in % 36,520 Erstes Kupondatum 03.07.2017 Letztes Kupondatum 03.04.2022


Solution

  • You can do this in two ways: (1) with a list comprehension or (2) with a for loop. Both produce the same result, so choose whichever you prefer.

    from bs4 import BeautifulSoup

    html = """<tbody>
    <tr><td>Kupon in %</td><td>36,520</td></tr>
    <tr><td>Erstes Kupondatum</td><td>03.07.2017</td></tr>
    <tr><td>Letztes Kupondatum</td><td>03.04.2022</td></tr>
    <tr><td>Zahlweise Kupon</td><td>Zinszahlung normal</td></tr>
    <tr><td>Spezialkupon Typ</td><td>Zinssatz variabel</td></tr>"""

    # Parse once; 'lxml' needs the lxml package ('html.parser' is the built-in alternative)
    soup = BeautifulSoup(html, 'lxml')

    # 1. List comprehension: take the text of every <td> and join with spaces
    print(' '.join([td.text for td in soup.find_all('td')]))

    # 2. Equivalent for loop
    cells = []
    for td in soup.find_all('td'):
        cells.append(td.text)

    print(' '.join(cells))


    Output: Kupon in % 36,520 Erstes Kupondatum 03.07.2017 Letztes Kupondatum 03.04.2022 Zahlweise Kupon Zinszahlung normal Spezialkupon Typ Zinssatz variabel
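
Joining every cell into one string loses the row structure. If you want the label and value of each row kept together, closer to a table, the following is a minimal sketch of one way to do it. It is not part of the answer above: it uses only the standard library's `html.parser` so that it runs without installing anything, and `RowCollector` and `parse_rows` are hypothetical names chosen for illustration. The same row-by-row idea should carry over to BeautifulSoup by looping over `soup.find_all('tr')` and reading the `td` cells of each row.

```python
from html.parser import HTMLParser

HTML = """<tbody>
<tr><td>Kupon in %</td><td>36,520</td></tr>
<tr><td>Erstes Kupondatum</td><td>03.07.2017</td></tr>
<tr><td>Letztes Kupondatum</td><td>03.04.2022</td></tr>
<tr><td>Zahlweise Kupon</td><td>Zinszahlung normal</td></tr>
<tr><td>Spezialkupon Typ</td><td>Zinssatz variabel</td></tr>"""


class RowCollector(HTMLParser):
    """Hypothetical helper: collects the text of each <td>, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []          # one list of cell strings per <tr>
        self._in_cell = False   # True while we are inside a <td>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])            # start a new row
        elif tag == "td" and self.rows:
            self.rows[-1].append("")        # start a new cell in the current row
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.rows[-1][-1] += data       # accumulate text of the current cell


def parse_rows(html):
    """Return the table as a list of rows, each a list of stripped cell texts."""
    parser = RowCollector()
    parser.feed(html)
    parser.close()
    return [[cell.strip() for cell in row] for row in parser.rows]


rows = parse_rows(HTML)

# Print one row per line, padding the label column so the values line up.
# Assumes every row has exactly two cells, as in the sample data.
width = max(len(row[0]) for row in rows)
for label, value in rows:
    print(f"{label:<{width}}  {value}")
```

Because `rows` is a plain list of lists, it can also be passed on to whatever table tool you prefer instead of being printed.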