
How to select all nodes of a DOMDocument with a single DOMXpath expression?


What is the xpath expression to select all nodes of a document?

Given this example XML:

<div class="header"/>

It contains three nodes: <div> (element), class= (attribute) and "header" (text).

$doc = new DOMDocument;
$doc->loadXml('<div class="header"/>');
$xpath = new DOMXPath($doc);

I tried with //node():

$xpath->query('//node()');

which returns only element nodes (I assume because of //). Is there a way to include other nodes, such as attributes and the text nodes inside attribute values?


Additional example:

I can obtain each node by using the DOMDocument API, e.g. to obtain the text node of the attribute value:

$doc = new DOMDocument;
$doc->loadXml('<div class="header"/>');
$class = $doc->documentElement->getAttributeNode('class');
echo $class->childNodes->item(0)->nodeName;

Which gives:

#text

How can I obtain the superset of all nodes with one XPath expression, in particular including the text node that is the child of that example class attribute node?


Solution

  • Use:

    //node() | //@* | //namespace::*
    

    This selects every element, text, processing-instruction and comment node (//node()), every attribute node (//@*) and every namespace node (//namespace::*). The document node / is not a child of any node, so //node() does not select it; append | / to the expression if it is needed. The XPath data model has no other node types.

    How you access the resulting node list (a DOMNodeList in PHP, an XmlNodeList in .NET) depends on the API of the XPath engine you are using, so consult its documentation.
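For the PHP setup from the question, a minimal sketch of running the expression could look like the following. The helper name describeNodes is hypothetical (it is not part of the DOM extension); it only labels each selected node with its PHP class and nodeName so that the different node types become visible.

```php
<?php
// Hypothetical helper (not part of PHP's DOM extension): runs an XPath
// expression and describes each selected node as "<class> <nodeName>".
function describeNodes(string $xml, string $expression): array
{
    $doc = new DOMDocument;
    $doc->loadXML($xml);
    $xpath = new DOMXPath($doc);

    $described = [];
    foreach ($xpath->query($expression) as $node) {
        // get_class() tells DOMElement, DOMAttr, DOMText, DOMComment,
        // DOMProcessingInstruction and DOMNameSpaceNode apart.
        $described[] = get_class($node) . ' ' . $node->nodeName;
    }
    return $described;
}

// The union expression from the answer, applied to the question's document.
print_r(describeNodes('<div class="header"/>', '//node() | //@* | //namespace::*'));
```

The list should contain the div element and the class attribute; how namespace nodes are reported depends on the underlying libxml2 build, so the exact output is not shown here.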

    XSLT-based example:

    <xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
     <xsl:output omit-xml-declaration="yes" indent="yes"/>
     <xsl:strip-space elements="*"/>
    
     <xsl:template match="/">
    
      <xsl:for-each select=
       "//node() | //@* | //namespace::*">
    
       Type: <xsl:text/>
    
       <xsl:choose>
        <xsl:when test="not(..)">
         <xsl:text>document node </xsl:text>
        </xsl:when>
        <xsl:when test="self::*">
         <xsl:text>element </xsl:text>
        </xsl:when>
        <xsl:when test="self::text()">
         <xsl:text>text-node </xsl:text>
        </xsl:when>
        <xsl:when test="self::comment()">
         <xsl:text>comment-node </xsl:text>
        </xsl:when>
        <xsl:when test="self::processing-instruction()">
         <xsl:text>PI-node </xsl:text>
        </xsl:when>
        <xsl:when test="count(.|../@*) = count(../@*)">
         <xsl:text>attribute-node </xsl:text>
        </xsl:when>
        <xsl:when test=
        "count(.|../namespace::*) = count(../namespace::*)">
         <xsl:text>namespace-node </xsl:text>
        </xsl:when>
       </xsl:choose>
    
       <xsl:text>Name: "</xsl:text>
       <xsl:value-of select="name()"/>" <xsl:text/>
    
       <xsl:text>Value: </xsl:text>
       <xsl:value-of select="."/>
    
      </xsl:for-each>
    
     </xsl:template>
    </xsl:stylesheet>
    

    When this XSLT transformation is applied to any XML document, it selects the nodes matched by the above XPath expression (the transformation intentionally excludes white-space-only text nodes) and outputs, in document order, the type, name and string-value of each selected node.

    For example, when applied on this XML document:

    <networkOfBridges xmlns:x="x">
        <bridge id="1"  otherside="A" />
        <!-- A Comment -->
        <bridge id="2"  oneside="A"/>
        <?PI Processing Instruction ?>
        <bridge id="3"  oneside="A" otherside="A" />
    </networkOfBridges>
    

    the result is:

       Type: element Name: "networkOfBridges" Value: 
    
       Type: namespace-node Name: "xml" Value: http://www.w3.org/XML/1998/namespace
    
       Type: namespace-node Name: "x" Value: x
    
       Type: element Name: "bridge" Value: 
    
       Type: namespace-node Name: "xml" Value: http://www.w3.org/XML/1998/namespace
    
       Type: namespace-node Name: "x" Value: x
    
       Type: attribute-node Name: "id" Value: 1
    
       Type: attribute-node Name: "otherside" Value: A
    
       Type: comment-node Name: "" Value:  A Comment 
    
       Type: element Name: "bridge" Value: 
    
       Type: namespace-node Name: "xml" Value: http://www.w3.org/XML/1998/namespace
    
       Type: namespace-node Name: "x" Value: x
    
       Type: attribute-node Name: "id" Value: 2
    
       Type: attribute-node Name: "oneside" Value: A
    
       Type: PI-node Name: "PI" Value: Processing Instruction 
    
       Type: element Name: "bridge" Value: 
    
       Type: namespace-node Name: "xml" Value: http://www.w3.org/XML/1998/namespace
    
       Type: namespace-node Name: "x" Value: x
    
       Type: attribute-node Name: "id" Value: 3
    
       Type: attribute-node Name: "oneside" Value: A
    
       Type: attribute-node Name: "otherside" Value: A
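
Note that no attribute in the result above has a child: in the XPath data model an attribute has a string-value but no child nodes, so the text node that the question reaches through $class->childNodes exists only in the DOM. One possible way to obtain that DOM-level superset in PHP, sketched here under that assumption, is to run the XPath query and then walk the children of each selected attribute. The function name allDomNodes is hypothetical, and the namespace axis and the document node are left out to keep the sketch short.

```php
<?php
// Hypothetical helper: the nodes selected by XPath plus the DOM-only
// children of attribute nodes, which XPath itself cannot select.
function allDomNodes(DOMDocument $doc): array
{
    $xpath = new DOMXPath($doc);
    $all = [];
    foreach ($xpath->query('//node() | //@*') as $node) {
        $all[] = $node;
        if ($node instanceof DOMAttr) {
            // The DOM, unlike XPath, gives an attribute child nodes.
            foreach ($node->childNodes as $child) {
                $all[] = $child;
            }
        }
    }
    return $all;
}

$doc = new DOMDocument;
$doc->loadXML('<div class="header"/>');
foreach (allDomNodes($doc) as $node) {
    echo $node->nodeName, "\n";
}
```

For the question's document this lists the div element, the class attribute and the attribute's #text child.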