I found that HtmlAgilityPack's SelectSingleNode always starts from the first node of the original DOM, even when I call it on a child node. Is there an equivalent method that lets me set the starting node?
Sample HTML:
<html>
<body>
<a href="https://home.com">Home</a>
<div id="contentDiv">
<tr class="blueRow">
<td scope="row"><a href="https://iwantthis.com">target</a></td>
</tr>
</div>
</body>
</html>
Code that does not work:
// Expected: https://iwantthis.com, actual: https://home.com
string url = contentDiv.SelectSingleNode("//tr[@class='blueRow']")
.SelectSingleNode("//a") //What should this be ?
.GetAttributeValue("href", "");
I had to replace the code above with this workaround:
var tds = contentDiv.SelectSingleNode("//tr[@class='blueRow']").Descendants("td");
string url = "";
foreach (HtmlNode td in tds)
{
if (td.Descendants("a").Any())
{
url = td.ChildNodes.First().GetAttributeValue("href", "");
}
}
I am using HtmlAgilityPack 1.7.4 on .NET Framework 4.6.2.
An XPath expression that begins with // always starts at the root of the document, no matter which node you call it on. SelectSingleNode("//a") means: start at the root of the document and find the first a anywhere in the document. That is why it grabs the Home link.
If you want to start from the current node, prefix the expression with the . selector. SelectSingleNode(".//a") means: find the first a anywhere beneath the current node.
So your code would look like this:
string url = contentDiv.SelectSingleNode(".//tr[@class='blueRow']")
.SelectSingleNode(".//a")
.GetAttributeValue("href", "");
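SelectSingleNode returns null when nothing matches, so the chained calls above throw a NullReferenceException if the row or the link is missing. Below is a minimal null-safe sketch of the same relative-XPath idea. The class RelativeXPathDemo and the method FindTargetUrl are hypothetical names made up for illustration, and looking up contentDiv by its id is an assumption, since the question does not show how contentDiv was obtained.

```csharp
using HtmlAgilityPack;

public static class RelativeXPathDemo
{
    // Hypothetical helper for illustration; not part of HtmlAgilityPack.
    public static string FindTargetUrl(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Assumption: contentDiv is located by its id. An absolute path is
        // fine here because we do want to search the whole document.
        HtmlNode contentDiv = doc.DocumentNode.SelectSingleNode("//div[@id='contentDiv']");

        // The leading "." keeps the search beneath contentDiv. The row and
        // the link are combined into a single relative expression.
        HtmlNode link = contentDiv?.SelectSingleNode(".//tr[@class='blueRow']//a");

        // Return an empty string when the div, the row, or the link is missing.
        return link?.GetAttributeValue("href", "") ?? "";
    }
}
```

Folding both steps into one expression, .//tr[@class='blueRow']//a, means only the first step has to start with a dot, because the rest of the path is evaluated relative to the matched row.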