c# xpath html-agility-pack

How to get last parent node with ancestors xpath in html agility pack


How can I get the XPath of an element's last parent node, including its ancestors, in HTML Agility Pack (HAP)? For example, given the following HTML document:

<html>
   <body>
      <div>
         <div>
            <div>
               <a>
                  <h3>
                  </h3>
               </a>
            </div>
         </div>
      </div>
   </body>
</html>

I need to get the last parent node and its ancestor path in HAP. For example, the XPath of the innermost element in the above HTML document is

/html/body/div/div[1]/div[2]/a/h3

The expected XPath is

/html/body/div/div[1]/div[2]

Note that I need to obtain the expected XPath dynamically, not as a hardcoded value. That is, starting from the last element, I need the path of its parent together with its ancestors.


Solution

  • Luckily, Html Agility Pack's HtmlNode exposes an XPath property and an Ancestors() method that give you exactly what you want.

    Select an HtmlNode, move to its parent node, and read the XPath of the first or last ancestor (via LINQ) like this:

    htmlNode.ParentNode.Ancestors().FirstOrDefault().XPath
    

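    This returns the nearest ancestor of the parent node, since Ancestors() enumerates from the closest ancestor outward. The outermost ancestor is retrieved like this: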

    htmlNode.ParentNode.Ancestors().LastOrDefault().XPath
    

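    Alternatively, iterate over Ancestors() yourself. Both FirstOrDefault() and LastOrDefault() return null when the sequence is empty, so check for null before reading XPath.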
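
For reference, here is a minimal end-to-end sketch of the approach, assuming the HtmlAgilityPack NuGet package is referenced. The inline HTML string, the "//h3" query, and the variable names are illustrative assumptions, not part of the question. Note that the paths HAP reports carry a positional index on each step, so they will not look exactly like the hand-written path in the question.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        // Same nesting as the document in the question, written on one line
        // so that no whitespace text nodes are involved.
        const string html =
            "<html><body><div><div><div><a><h3>Title</h3></a></div></div></div></body></html>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Pick the innermost element; "//h3" is only an example query.
        HtmlNode h3 = doc.DocumentNode.SelectSingleNode("//h3");
        if (h3 == null)
        {
            Console.WriteLine("No <h3> element found.");
            return;
        }

        // ParentNode is the <a>; the first ancestor of <a> is the enclosing <div>.
        // The null-conditional operators guard against a missing parent or ancestor.
        HtmlNode target = h3.ParentNode?.Ancestors().FirstOrDefault();
        Console.WriteLine(target?.XPath ?? "(no ancestor)");

        // Walk every ancestor of <h3>, from the nearest one outward.
        foreach (HtmlNode ancestor in h3.Ancestors())
        {
            Console.WriteLine($"{ancestor.Name}: {ancestor.XPath}");
        }
    }
}
```

The first line printed is the path of the div that wraps the a element, which is the node the question asks for. The loop then lists each ancestor in turn, which is useful if you need to stop at a specific level rather than always taking the first or last one.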