Hi guys, I have run into trouble. I need to parse a timetable from HTML in Java and display it in a mobile-friendly format. I am going to use jsoup to parse the HTML, and I think I will use getElementsByTag() to retrieve the data. But I am stuck on the algorithm, because the HTML is all over the place and looks difficult for jsoup to read. If anyone has an idea of what algorithm to use, I would be really happy and you would make my day!
I haven't tested this, but it should work:
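import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
Document doc = org.jsoup.Jsoup.connect(url).get();
String cssPath = "body > table > tbody > tr:nth-child(6) > td > table:nth-child(2) > tbody";
Element tbody = doc.select(cssPath).first(); // null if the path matches nothing
for (Element row : tbody.select("tr")) {
    // iterate and process each row as you like
}
You will still need to define url, put this inside a method, and handle the IOException that get() can throw.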
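For the mobile-friendly part, one option is to flip the grid into one list per day once you have the rows. This is only a sketch and it guesses at the table layout, since the question doesn't show the HTML: it assumes the first row holds the day names, the first column holds the time slot, and there are no merged (rowspan/colspan) cells. TimetableRegroup and byDay are made-up names, and the sample data stands in for the cell.text() of each td in a row.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TimetableRegroup {

    // Hypothetical helper: takes a grid (header row of day names, then one row
    // per time slot) and returns day -> "slot: entry" lines, skipping empty cells.
    public static Map<String, List<String>> byDay(List<List<String>> rows) {
        Map<String, List<String>> days = new LinkedHashMap<>(); // keeps column order
        if (rows.isEmpty()) {
            return days;
        }
        List<String> header = rows.get(0);
        // Assumption: column 0 is the time slot, so day names start at column 1.
        for (int col = 1; col < header.size(); col++) {
            days.put(header.get(col), new ArrayList<>());
        }
        for (List<String> row : rows.subList(1, rows.size())) {
            if (row.isEmpty()) {
                continue; // spacer row with no cells
            }
            String slot = row.get(0);
            for (int col = 1; col < header.size() && col < row.size(); col++) {
                String entry = row.get(col).trim();
                if (!entry.isEmpty()) {
                    days.get(header.get(col)).add(slot + ": " + entry);
                }
            }
        }
        return days;
    }

    public static void main(String[] args) {
        // Made-up sample data, one inner list per table row.
        List<List<String>> rows = Arrays.asList(
                Arrays.asList("", "Mon", "Tue"),
                Arrays.asList("08:00", "Math", ""),
                Arrays.asList("09:00", "Physics", "Chemistry"));
        byDay(rows).forEach((day, entries) -> System.out.println(day + " -> " + entries));
    }
}
```

Each map entry can then be shown as its own screen or list section, which tends to read better on a phone than a wide table. If the real table uses merged cells, the column indexes will not line up and this needs adjusting.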