I have this HTML code:
<td class="topic starter"><a href="http://www.test.com">Title</a></td>
I want to extract "Title" and the URL, so I did this:
Elements titleUrl = doc.getElementsByAttributeValue("class", "topic starter");
String title = titleUrl.text();
And this works for the title, but for the URL I tried the following:
String url = titleUrl.html();
String url = titleUrl.attr("a [href]");
String url = titleUrl.attr("a[href]");
String url = titleUrl.attr("href");
String url = titleUrl.attr("a");
But none of them work, and I'm not able to get the URL.
Try this:
Element link = doc.select("td.topic.starter > a").first();
String url = link.attr("href");
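You first select the a element and then extract its href attribute. Your attempts fail because titleUrl holds the td element, which has no href attribute of its own, and attr() expects an attribute name, not a selector.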
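For completeness, here is a minimal self-contained sketch of the same approach, assuming Jsoup is on the classpath. The class name LinkExtractor and the method extractTitleAndUrl are made up for illustration, and the td is wrapped in a table and tr because an HTML parser would normally discard a td that is not inside a table.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class LinkExtractor {

    // Hypothetical helper: returns {title, url} for the first link that is a
    // direct child of a td with classes "topic" and "starter", or null if
    // no such link exists.
    static String[] extractTitleAndUrl(String html) {
        Document doc = Jsoup.parse(html);
        Element link = doc.select("td.topic.starter > a").first();
        if (link == null) {
            return null; // first() returns null when nothing matched
        }
        // text() gives the link text, attr("href") the raw attribute value
        return new String[] { link.text(), link.attr("href") };
    }

    public static void main(String[] args) {
        // The td from the question, wrapped in a table so the parser keeps it
        String html = "<table><tr>"
                + "<td class=\"topic starter\"><a href=\"http://www.test.com\">Title</a></td>"
                + "</tr></table>";

        String[] result = extractTitleAndUrl(html);
        if (result != null) {
            System.out.println(result[0]);
            System.out.println(result[1]);
        }
    }
}
```

The null check matters because first() returns null when the selector matches nothing, and calling attr() on that would throw a NullPointerException.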