java html parsing jsoup timetable

Parsing an HTML timetable in Java


Hi guys, I have run into trouble. I need to parse a timetable from HTML in Java and display it in a mobile-friendly format. I am going to use jsoup to parse the HTML, and I think I will use getElementsByTag() to retrieve the data. But I am stuck on the algorithm, because the HTML is all over the place and looks difficult for jsoup to read. If anyone has an idea which algorithm to use, I would be really happy and you would make my day!

The link to the timetable

and also it may look like this


Solution

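  • I haven't tested this, but it should work:

    // needs: import org.jsoup.nodes.Document; import org.jsoup.nodes.Element;
    Document doc = org.jsoup.Jsoup.connect(url).get();
    String cssPath = "body > table > tbody > tr:nth-child(6) > td > table:nth-child(2) > tbody";
    Element tbody = doc.select(cssPath).first(); // null if the selector matches nothing
    for (Element row : tbody.select("tr")) {
        // process each row as you like
    }

    You still need to supply url (the address of the timetable page) and put this in a method that handles the IOException thrown by get().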
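To make the "process each row" step more concrete, here is a minimal sketch of turning table rows into lists of cell text with jsoup. It is not tested against the real timetable page: the sample HTML, the class name TimetableSketch and the method extractRows are all made up for illustration, and the sketch parses an inline string with Jsoup.parse instead of fetching a URL. On a page with nested tables, the broad "table tr" selector would also pick up rows from the inner tables, so it would probably need to be narrowed to something like the selector path in the answer.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

// Sketch only: class name, method name and sample HTML are hypothetical.
final class TimetableSketch {

    // Hypothetical stand-in for the real timetable page.
    static final String SAMPLE_HTML =
        "<table>"
        + "<tr><th>Time</th><th>Monday</th><th>Tuesday</th></tr>"
        + "<tr><td>08:00</td><td>Math</td><td>Physics</td></tr>"
        + "<tr><td>09:00</td><td>English</td><td></td></tr>"
        + "</table>";

    // Returns the text of every cell, one inner list per table row.
    static List<List<String>> extractRows(String html) {
        Document doc = Jsoup.parse(html);
        List<List<String>> rows = new ArrayList<>();
        for (Element row : doc.select("table tr")) {
            List<String> cells = new ArrayList<>();
            // Header and data cells are both collected, in document order.
            for (Element cell : row.select("th, td")) {
                cells.add(cell.text());
            }
            rows.add(cells);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Print each row with its cells separated by " | ".
        for (List<String> row : extractRows(SAMPLE_HTML)) {
            System.out.println(String.join(" | ", row));
        }
    }
}
```

Once the rows are plain lists of strings, regrouping them for a mobile layout (for example one list of lessons per day) no longer depends on the HTML structure at all.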