Selecting peculiar XML tags with phpQuery


phpQuery is a really nice tool that has helped me tremendously in the past with parsing well-formed XHTML and XML documents, but I have recently run into a problem trying to select elements that have colons in their tag names, such as the following:

<isc:thumb><![CDATA[http://example.com/foo_thumb.jpg]]></isc:thumb>

I've tried to use the pq() function to select all of these elements:

foreach ( pq("isc:thumb") as $thumbnail ) {
  print pq( $thumbnail )->text();
}

Unfortunately, this does nothing. If I try another element, such as one with a tag name of id, the results appear as expected.


Solution

  • You are trying to find a thumb element that belongs to the namespace bound to the isc prefix (see XML namespace), not a tag literally named isc:thumb.

    phpQuery can happily query for namespaced elements, just not in the way you are attempting. Instead, provide the tag in the form namespace|tagname (here, isc|thumb). Note also that the namespace has to be registered with phpQuery's XPath handler (which is just a DOMXPath object) before phpQuery can recognise the prefix.

    Here's a quick example with a sample XML document (obviously, use your own XML and be sure to provide the correct namespace URI).

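    // Assumes phpQuery has already been loaded (e.g. via require).
    // Load the sample XML; it becomes phpQuery's default document.
    phpQuery::newDocumentXML('<root xmlns:isc="urn.example.isc">
      <isc:thumb><![CDATA[http://example.com/foo_thumb.jpg]]></isc:thumb>
      <isc:thumb><![CDATA[http://example.com/bar_thumb.jpg]]></isc:thumb>
    </root>
    ');
    // Register the isc prefix with the document's DOMXPath object.
    phpQuery::getDocument()->xpath->registerNamespace('isc', 'urn.example.isc');
    // Select with namespace|tagname rather than namespace:tagname.
    foreach ( pq("isc|thumb") as $thumbnail ) {
        echo pq( $thumbnail )->text() . PHP_EOL;
    }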

    Which outputs:

    http://example.com/foo_thumb.jpg
    http://example.com/bar_thumb.jpg
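
Since the XPath handler mentioned above is a plain DOMXPath object, the same registration step can be tried without phpQuery at all. The following is only a minimal sketch using PHP's built-in DOM extension; the namespace URI urn:example:isc is a placeholder chosen for this illustration (it is not the URI from the example above), and the variable names are arbitrary. Substitute the URI your own document declares.

```php
<?php
// Sample document. The namespace URI is a placeholder for illustration only.
$xml = <<<'XML'
<root xmlns:isc="urn:example:isc">
  <isc:thumb><![CDATA[http://example.com/foo_thumb.jpg]]></isc:thumb>
  <isc:thumb><![CDATA[http://example.com/bar_thumb.jpg]]></isc:thumb>
</root>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);

$xpath = new DOMXPath($doc);

// Bind a prefix to the namespace URI for use in XPath expressions.
// What has to match the document is the URI, not the prefix itself.
$xpath->registerNamespace('isc', 'urn:example:isc');

// In XPath the separator is a colon; the pipe form (isc|thumb) is selector syntax.
$thumbs = [];
foreach ($xpath->query('//isc:thumb') as $node) {
    $thumbs[] = $node->textContent;
}

echo implode(PHP_EOL, $thumbs), PHP_EOL;
```

Run on its own, this should print the same two thumbnail URLs as the phpQuery version. Querying //isc:thumb before calling registerNamespace() does not match these elements, which is consistent with the unregistered selector in the question returning nothing.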