python html beautifulsoup icalendar

How to create iCal files from an HTML table


I'm trying to create importable calendar events from a website. The website lists the events in a standard HTML table.

I was wondering whether BeautifulSoup is the right way to tackle this problem, because I only get the first entry and then nothing.

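    # Python 2 (urllib2, print statement)
    import urllib2
    from bs4 import BeautifulSoup

    quote_page = "http://www.ellen-hartmann.de/babybasare.html"
    page = urllib2.urlopen(quote_page)
    soup = BeautifulSoup(page, "html.parser")

    # locate the events table and the header cell
    table = soup.find("table", {"border": "1"})
    td = table.find("td", text="Veranstaltungstyp ")
    print table

    # try to step to the next row
    td_next = table.find_next("tr")
    print td_next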

Solution

  • I think it stops because you're using find(), which returns only the first matching tag, instead of find_all(), which returns all matching tags. You then have to loop over the results:

    import requests
    from bs4 import BeautifulSoup
    
    response = requests.get("http://www.ellen-hartmann.de/babybasare.html")
    
    soup = BeautifulSoup(response.text, 'html.parser')
    
    # now let's find every row in every table
    for row in soup.find_all("tr"):
    
        # grab the cells within the row
        cells = row.find_all("td")
    
        # print the value of the cells as a list.  This is the point where
        # you will need to filter the rows to figure out what is an event (and
        # what is not), determine the start date and time, and convert the values 
        # to iCal format.
        print([c.text for c in cells])
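
For the last step mentioned in the comments above (converting the values to iCal format), here is a minimal sketch using only the standard library. It is not tailored to the real page: the column order (date, title, location), the German date format `%d.%m.%Y`, the `rows_to_ical` and `escape_text` names, the PRODID and the UID domain are all assumptions made for illustration. Each row is expected to be a list of cell strings like the one printed in the loop above. Rows whose first cell does not parse as a date are treated as non-events and skipped.

```python
from datetime import datetime, timezone
import uuid


def escape_text(value):
    """Escape characters that are special in iCalendar TEXT values."""
    return (value.replace("\\", "\\\\")
                 .replace(";", "\\;")
                 .replace(",", "\\,")
                 .replace("\n", "\\n"))


def rows_to_ical(rows, date_format="%d.%m.%Y"):
    """Build an iCalendar string from rows of [date, title, location].

    The column order and the date format are assumptions; adjust them
    to whatever the real table contains.
    """
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    lines = [
        "BEGIN:VCALENDAR",
        "VERSION:2.0",
        "PRODID:-//example//table-to-ical//EN",  # placeholder product id
    ]
    for cells in rows:
        cells = [c.strip() for c in cells]
        if len(cells) < 3:
            continue  # empty or spacer row (e.g. a header row with no <td>)
        try:
            day = datetime.strptime(cells[0], date_format)
        except ValueError:
            continue  # first cell is not a date, so not an event row
        lines += [
            "BEGIN:VEVENT",
            "UID:" + str(uuid.uuid4()) + "@example.invalid",
            "DTSTAMP:" + stamp,
            "DTSTART;VALUE=DATE:" + day.strftime("%Y%m%d"),  # all-day event
            "SUMMARY:" + escape_text(cells[1]),
            "LOCATION:" + escape_text(cells[2]),
            "END:VEVENT",
        ]
    lines.append("END:VCALENDAR")
    # iCalendar content lines are terminated with CRLF
    return "\r\n".join(lines) + "\r\n"
```

Inside the loop above you would collect `[c.text for c in cells]` into a list, pass that list to `rows_to_ical`, and write the result to a `.ics` file. The sketch does not fold lines longer than 75 octets and does not handle start times or time zones, so a dedicated library may be the better choice once the table layout is known.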