php, html, xml, xpath

Selecting a css class with xpath


I want to select elements by class alone, in this case the class .date

For some reason, I cannot get this to work. If anyone knows what is wrong with my code, it would be much appreciated.

@$doc = new DOMDocument();
@$doc->loadHTML($html);
$xml = simplexml_import_dom($doc); // just to make xpath more simple
$images = $xml->xpath('//[@class="date"]');                             
foreach ($images as $img)
{
    echo  $img." ";
}

Solution

  • I want to write the canonical answer to this question because the answer above has a problem.

    Our problem

    The CSS selector:

    .foo
    

    will select any element that has the class foo.

    How do you do this in XPath?

    Although XPath is more powerful than CSS, XPath doesn't have a native equivalent of a CSS class selector. However, there is a solution.

    The right way to do it

    The equivalent selector in XPath is:

    //*[contains(concat(" ", normalize-space(@class), " "), " foo ")]
    

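    The function normalize-space strips leading and trailing whitespace and collapses each run of whitespace characters into a single space. concat then pads the normalized class list with a space on each side, so every class name in the list is delimited by spaces and contains can only match a whole name.

    More generally, this is also the equivalent of the CSS selector: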

    *[class~="foo"]
    

    which will match any element whose class attribute value is a list of whitespace-separated values, one of which is exactly equal to foo.
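
    Applied to the .date class from the question, a minimal PHP sketch could look like the following. The helper name classXPath and the sample HTML are made up for illustration, and the helper assumes the class name contains no quotes or whitespace. Note also that the question's expression //[@class="date"] is not valid XPath, because no node test (such as *) follows the //.

```php
<?php
// Hypothetical helper: builds the class-matching XPath for one class name.
// Assumes $class contains no double quotes and no whitespace.
function classXPath(string $class): string
{
    return sprintf(
        '//*[contains(concat(" ", normalize-space(@class), " "), " %s ")]',
        $class
    );
}

// Sample markup, invented for this example.
$html = '<html><body>'
      . '<p class="date">2012-01-01</p>'
      . '<p class="  date  small">2012-01-02</p>'
      . '<p class="update">not a date</p>'
      . '</body></html>';

// Collect libxml parse warnings instead of silencing them with @.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$dates = [];
foreach ($xpath->query(classXPath('date')) as $node) {
    $dates[] = $node->textContent;
}

echo implode(' ', $dates), "\n"; // prints "2012-01-01 2012-01-02"
```

    The element with class update is skipped even though the string "update" contains "date", because " update " does not contain " date ".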

    A couple of obvious, but wrong ways to do it

    The XPath selector:

    //*[@class="foo"]
    

    doesn't work, because it won't match an element that has more than one class, for example:

    <div class="foo bar">
    

    It also won't match if there is any extra whitespace around the class name:

    <div class="  foo ">
    

    The 'improved' XPath selector:

    //*[contains(@class, "foo")]
    

    doesn't work either, because it wrongly matches elements with the class foobar, for example:

    <div class="foobar">
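
    To see the difference between the three selectors side by side, here is a small PHP sketch that runs each of them against the example elements above. The sample markup, the text labels, and the helper name matchedText are all assumptions made for this illustration.

```php
<?php
// Sample markup, invented for this example: each element's text names its case.
$html = '<html><body>'
      . '<div class="foo">exact</div>'
      . '<div class="foo bar">multiple</div>'
      . '<div class="  foo ">padded</div>'
      . '<div class="foobar">substring</div>'
      . '</body></html>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Hypothetical helper: returns the text of every node matched by $expr.
function matchedText(DOMXPath $xpath, string $expr): array
{
    $out = [];
    foreach ($xpath->query($expr) as $node) {
        $out[] = $node->textContent;
    }
    return $out;
}

$exact    = matchedText($xpath, '//*[@class="foo"]');
$contains = matchedText($xpath, '//*[contains(@class, "foo")]');
$correct  = matchedText(
    $xpath,
    '//*[contains(concat(" ", normalize-space(@class), " "), " foo ")]'
);

echo 'exact:    ', implode(', ', $exact), "\n";
echo 'contains: ', implode(', ', $contains), "\n";
echo 'correct:  ', implode(', ', $correct), "\n";
```

    The exact comparison misses the element with two classes, contains also picks up the foobar element, and only the normalize-space version returns the three elements that really have the class foo.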
    

    Credit goes to this fella, who published the earliest solution to this problem that I could find on the web: http://dubinko.info/blog/2007/10/01/simple-parsing-of-space-seprated-attributes-in-xpathxslt/