I am trying to iterate over HTML nodes and extract information from each of them.
This is an example of the HTML:
<div class="less1">
<h4>Test name 1</h4>
<div>
<div id="email">test1@test.com</div>
<div id="email">test2@test.com</div>
<div id="email">test3@test.com</div>
</div>
</div>
<div class="less1">
<h4>Test name 2</h4>
<div>
<div id="email">test_name1@test.com</div>
<div id="email">test_name2@test.com</div>
<div id="email">test_name3@test.com</div>
</div>
</div>
<div class="less1">
<h4>Test name 3</h4>
<div>
<div id="email">test_name_3@test.com</div>
</div>
</div>
<div class="less1">
<h4>Test name 4</h4>
</div>
This is my code example.
final List<HtmlListItem> nodes = htmlPage.getByXPath("//*[@class=\"less1\"]");
for (HtmlListItem node: nodes) {
final List<?> divs = node.getByXPath("//h4/text()");
}
The size of the "divs" list is always 4.
Is it possible to get only the one result that belongs to the current node?
To get only the first matching element, use getFirstByXPath, which returns a single object rather than a list:
final Object div = node.getFirstByXPath("//h4/text()");
If you need a specific element by index:
final Object div = node.getByXPath("//h4/text()").get(index);
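UPDATE
The problem is most likely the absolute XPath. An expression that starts with // is evaluated from the document root even when it is called on a node, so it matches every h4 on the page. Use a path relative to the current node instead:
final Object text = node.getFirstByXPath("h4/text()");
final List<?> emails = node.getByXPath("div/div");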
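The difference between the absolute and the relative path can be reproduced without HtmlUnit. The following is a minimal sketch that uses the JDK's built-in javax.xml.xpath engine instead of HtmlUnit, on the assumption that both follow the same XPath 1.0 rules for a leading //. The class name RelativeXPathDemo, its helper methods, and the wrapping root element are made up for illustration; the markup is the one from the question.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class: JDK XPath engine, not HtmlUnit.
class RelativeXPathDemo {

    // Markup from the question, wrapped in a <root> element (an assumption)
    // so that it parses as well-formed XML.
    static final String MARKUP = "<root>"
        + "<div class=\"less1\"><h4>Test name 1</h4><div>"
        + "<div id=\"email\">test1@test.com</div>"
        + "<div id=\"email\">test2@test.com</div>"
        + "<div id=\"email\">test3@test.com</div></div></div>"
        + "<div class=\"less1\"><h4>Test name 2</h4><div>"
        + "<div id=\"email\">test_name1@test.com</div>"
        + "<div id=\"email\">test_name2@test.com</div>"
        + "<div id=\"email\">test_name3@test.com</div></div></div>"
        + "<div class=\"less1\"><h4>Test name 3</h4><div>"
        + "<div id=\"email\">test_name_3@test.com</div></div></div>"
        + "<div class=\"less1\"><h4>Test name 4</h4></div>"
        + "</root>";

    // Parses the markup into a DOM document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse markup", e);
        }
    }

    // Evaluates the expression with the given node as the XPath context node.
    static List<Node> nodes(Node context, String expression) {
        try {
            NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expression, context, XPathConstants.NODESET);
            List<Node> result = new ArrayList<>();
            for (int i = 0; i < hits.getLength(); i++) {
                result.add(hits.item(i));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("bad XPath: " + expression, e);
        }
    }

    // Same as nodes(), but returns the text content of every match.
    static List<String> texts(Node context, String expression) {
        List<String> result = new ArrayList<>();
        for (Node hit : nodes(context, expression)) {
            result.add(hit.getTextContent());
        }
        return result;
    }

    public static void main(String[] args) {
        Document doc = parse(MARKUP);
        for (Node block : nodes(doc, "//*[@class='less1']")) {
            // Leading "//" restarts at the document root: all headings match.
            System.out.println("absolute: " + texts(block, "//h4/text()"));
            // No leading slash: only children of the current block match.
            System.out.println("relative: " + texts(block, "h4/text()"));
            System.out.println("emails:   " + texts(block, "div/div"));
        }
    }
}
```

In this sketch the absolute expression yields all four headings for every block, while the relative one yields only the heading of the current block. If the h4 could sit deeper inside the block, .//h4/text() (with a leading dot) searches all descendants of the current node without going back to the document root.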
Otherwise, you can extract the data from every node by walking its child nodes:
for (HtmlListItem node: nodes) {
NodeList children = node.getChildNodes();
for (int i = 0; i < children.getLength(); i++) {
Node child = children.item(i);
/** extract data from child **/
}
}