I have HTML code like the one below:
<html>
<body>
<div id="1">
<table>
<tr>
<td>ID</td>
<td>:</td>
<td>123</td>
</tr>
<tr>
<td>Status</td>
<td>:</td>
<td>Fail</td>
</tr>
</table>
</div>
<div id="2">
<table>
<tr>
<td>ID</td>
<td>:</td>
<td>456</td>
</tr>
<tr>
<td>Status</td>
<td>:</td>
<td>Success</td>
</tr>
</table>
</div>
<div id="3">
<table>
<tr>
<td>ID</td>
<td>:</td>
<td>789</td>
</tr>
<tr>
<td>Status</td>
<td>:</td>
<td>Fail</td>
</tr>
</table>
</div>
<div id="4">
<table>
<tr>
<td>ID</td>
<td>:</td>
<td>135</td>
</tr>
<tr>
<td>Status</td>
<td>:</td>
<td>Success</td>
</tr>
</table>
</div>
</body>
</html>
I need to parse this HTML. I need to iterate over all the div tags and, in each div, search the td cells for "Status". If it is present, I get its 2nd adjacent td value, i.e. Fail / Success. If the value is "Fail", I then need to search for "ID" and, if it is present, print its 2nd adjacent td value, i.e. 123 and 789 in this case.
The pseudocode might look like this:
if(code contains "Status")
{
1. Get its 2nd td value i.e., Fail/Success
if(td value is "Fail")
{
1. Search for "ID"
if("ID" present)
{
Print the number/2nd adjacent <td> value
}
}
}
I tried this in JavaScript, something like below:
var t0 = $(this).find('tr:has(td:contains("Test Status"))');
if (t0.length)
{
    var str0 = t0.text().trim();
    str0 = /:(.+)/.exec(str0)[1];
    if (str0 == "FAIL")
    {
        var t1 = $(this).find('tr:has(td:contains("Test ID"))');
        if (t1.length)
        {
            str = t1.text().trim();
            str = /:(.+)/.exec(str)[1];
            testIDArray.push(str);
            // alert(str);
        }
    }
}
But I need to do it in Java using jsoup. I tried something like below:
String htmlString = fileContent;
Document document = Jsoup.parse(htmlString);
Elements elements = document.body().select("div");
for (Element element : elements) {
String link = element.select("td:contains(Test Status)").attr("<tr>");
if(link != null || !(link.isEmpty()))
{
System.out.println(link);
System.out.println("=========================");
}
}
Kindly help me with this. I don't know how to proceed.
Thanks in advance.
You can use Java Streams to solve this:
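import java.util.List;
import java.util.stream.Collectors;

// document is the jsoup Document parsed from your HTML
List<String> failedIds = document.body().select("div table").stream()
    .map(e -> e.select("tr")) // all rows of each table
    .filter(trs -> "FAIL".equalsIgnoreCase(trs.last().select("td").last().text())) // last row ends in "Fail"
    .map(trs -> trs.first().select("td").last().text()) // last cell of the first row is the ID
    .collect(Collectors.toList());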
The result will be:
[123, 789]
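First you select div table to get all the table elements. Then you select all the tr elements of each table and keep only the tables whose last row has the Status Fail (trs -> "FAIL".equalsIgnoreCase(trs.last().select("td").last().text())). At the end you map each remaining table to its ID, which is the last cell of the first row (trs -> trs.first().select("td").last().text()). Note that this relies on the ID row being first and the Status row being last in each table, as in your sample.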
To print the IDs instead of creating a List, you can use .forEach():
document.body().select("div table").stream()
.map(e -> e.select("tr"))
.filter(trs -> "FAIL".equalsIgnoreCase(trs.last().select("td").last().text()))
.map(trs -> trs.first().select("td").last().text())
.forEach(System.out::println);
Alternatively you can use this (without Streams):
for (Element e : document.body().select("div table")) {
Elements trs = e.select("tr");
if ("FAIL".equalsIgnoreCase(trs.last().select("td").last().text())) {
String id = trs.first().select("td").last().text();
System.out.println(id);
}
}
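The snippets above pick rows by position (first row is the ID, last row is the Status). If the row order might differ, a lookup by label is closer to the pseudocode in the question. The following is only a minimal sketch of that idea, assuming each row has the shape label, colon, value as in the sample; the class name FailedIdFinder and the helper valueFor are made up for this illustration and are not part of jsoup.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, named only for this sketch.
class FailedIdFinder {

    // Finds the first row inside `div` whose first cell equals `label` and
    // returns the text of the cell two positions later (label, ":", value).
    // Returns null when no such row exists.
    static String valueFor(Element div, String label) {
        for (Element row : div.select("tr")) {
            Elements cells = row.select("td");
            if (cells.size() >= 3 && label.equals(cells.get(0).text())) {
                return cells.get(2).text();
            }
        }
        return null;
    }

    // Collects the ID of every div whose Status value is "Fail" (case-insensitive).
    static List<String> failedIds(String html) {
        Document document = Jsoup.parse(html);
        List<String> ids = new ArrayList<>();
        for (Element div : document.select("div")) {
            String status = valueFor(div, "Status");
            if ("Fail".equalsIgnoreCase(status)) {
                String id = valueFor(div, "ID");
                if (id != null) {
                    ids.add(id);
                }
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        // Shortened version of the HTML from the question: one failing and one passing div.
        String html = "<html><body>"
            + "<div id=\"1\"><table>"
            + "<tr><td>ID</td><td>:</td><td>123</td></tr>"
            + "<tr><td>Status</td><td>:</td><td>Fail</td></tr>"
            + "</table></div>"
            + "<div id=\"2\"><table>"
            + "<tr><td>ID</td><td>:</td><td>456</td></tr>"
            + "<tr><td>Status</td><td>:</td><td>Success</td></tr>"
            + "</table></div>"
            + "</body></html>";
        System.out.println(failedIds(html));
    }
}
```

This assumes the divs are not nested inside each other, as in your sample. With nested divs, document.select("div") would also match the outer div, and the same ID could be reported more than once.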