html, parsing, xpath

Get the XPath for all the nodes


Is there a library that can give me the XPath for all the nodes in an HTML page?


Solution

  • "Is there any library that can give me the XPath for all the nodes in an HTML page?"

    Yes, if this HTML page is a well-formed XML document.

    Which expression to use depends on what you mean by "node":

    //*
    

    selects all the elements in the document.

    /descendant-or-self::node()
    

    selects all elements, text nodes, processing instructions, comment nodes, and the root node /. It does not select attribute or namespace nodes, because they are not on the descendant axis.

    //text()
    

    selects all text nodes in the document.

    //comment()
    

    selects all comment nodes in the document.

    //processing-instruction()
    

    selects all processing instructions in the document.

    //@* 
    

    selects all attribute nodes in the document.

    //namespace::*
    

    selects all namespace nodes in the document.

    Finally, you can combine any of the above expressions using the union (|) operator.

    Thus, I believe that the following expression really selects "all the nodes" of any XML document:

    /descendant-or-self::node() | //@* | //namespace::*
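
The expressions above select nodes; they do not produce a separate path for each node. As a rough sketch of that second reading of the question, the following Python uses the standard library's xml.dom.minidom (a DOM parser, not an XPath engine) to walk a well-formed document and build a positional XPath for each element, attribute, text, comment, and processing-instruction node. The function name node_xpaths and the sample markup are made up for illustration, and the sketch assumes a document without namespaces or CDATA sections. The root node / and namespace nodes are not listed.

```python
from xml.dom import minidom, Node

# Map DOM node types to the XPath node test used as the path step.
NODE_TESTS = {
    Node.TEXT_NODE: "text()",
    Node.COMMENT_NODE: "comment()",
    Node.PROCESSING_INSTRUCTION_NODE: "processing-instruction()",
}


def node_xpaths(xml_text):
    """Return a positional XPath for every element, attribute, text,
    comment, and processing-instruction node, in document order.

    Hypothetical helper: assumes well-formed XML, no namespaces,
    and no CDATA sections.
    """
    doc = minidom.parseString(xml_text)
    paths = []

    def visit(parent, parent_path):
        counts = {}  # how many siblings with the same step we have seen
        for child in parent.childNodes:
            if child.nodeType == Node.ELEMENT_NODE:
                step = child.tagName
            elif child.nodeType in NODE_TESTS:
                step = NODE_TESTS[child.nodeType]
            else:
                continue  # e.g. a DOCTYPE, which is not an XPath node
            counts[step] = counts.get(step, 0) + 1
            path = f"{parent_path}/{step}[{counts[step]}]"
            paths.append(path)
            if child.nodeType == Node.ELEMENT_NODE:
                # Attributes are not children, so they get their own step.
                for name in sorted(child.attributes.keys()):
                    paths.append(f"{path}/@{name}")
                visit(child, path)

    visit(doc, "")
    return paths


if __name__ == "__main__":
    sample = '<html><body><p id="a">Hi</p><!--note--><p>Bye</p></body></html>'
    for p in node_xpaths(sample):
        print(p)
```

Real-world HTML that is not well-formed XML would first have to be converted by an HTML parser; that step is outside this sketch.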