xPath with optional child element (which should be returned if it exists)


I am using the xPath functions of PHP's DOMDocument.

Let's say, I have the HTML below (to illustrate my problem):

<span class="price">$5.000,00</span>
<span class="newPrice">$4.000,00</span>

The first span is always present, but the span with the 'newPrice' class only appears in some cases.

I used the XPath expression below, but it always returns the 'price' span, even when the other one is present. When the 'newPrice' span is present, I only want its value; otherwise I want the value of the 'price' span.

//span[@class='price'] | //span[@class='newPrice']

How can I achieve this? Any ideas?


Solution

  • It may help to formulate the condition differently:

    You want to select the <span> element with class="price" only if there is none with class="newPrice". Otherwise you want the one with class="newPrice".

    //span[(not(//span[@class="newPrice"]) and @class="price") or @class="newPrice"]
    

    This XPath expression returns the element you're looking for.

    Explanation: the first condition can be written as a predicate like this:

    not(//span[@class="newPrice"]) and @class="price"
    

    The second condition is the one you already had:

    @class="newPrice"
    

    With the right parentheses, you can combine the two with the or operator:

    //span[
      (
         not(//span[@class="newPrice"]) 
         and @class="price"
      ) 
      or 
      @class="newPrice"
    ]
    

    Since you want the price as a string, wrap the expression in string(). In PHP it looks like this (assuming $html holds the markup from the question):

    $doc = new DOMDocument();
    $doc->loadHTML($html);
    $xpath = new DOMXPath($doc);
    
    $expression = 'string(//span[(not(//span[@class="newPrice"]) and @class="price") or @class="newPrice"])';
    
    echo "your price: ", $xpath->evaluate($expression), "\n";
    

    Output:

    your price: $4.000,00
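
To check both cases from the question, the same expression can be wrapped in a small function and run against markup with and without the 'newPrice' span. This is only a sketch: the function name effectivePrice and the two sample strings are made up for illustration and are not part of the original answer.

```php
<?php
// Hypothetical helper (name chosen for this example): returns the text of
// the 'newPrice' span if one exists, otherwise the text of the 'price' span.
function effectivePrice(string $html): string
{
    $doc = new DOMDocument();
    // loadHTML() wraps the fragment in <html><body> on its own.
    $doc->loadHTML($html);
    $xpath = new DOMXPath($doc);

    // Same expression as in the answer; string() converts the first
    // matching node (in document order) to its text content.
    $expression = 'string(//span[(not(//span[@class="newPrice"]) and @class="price") or @class="newPrice"])';

    return $xpath->evaluate($expression);
}

// Single-quoted strings, so "$5" is not treated as a PHP variable.
$withNewPrice = '<span class="price">$5.000,00</span><span class="newPrice">$4.000,00</span>';
$onlyPrice    = '<span class="price">$5.000,00</span>';

echo effectivePrice($withNewPrice), "\n"; // $4.000,00
echo effectivePrice($onlyPrice), "\n";    // $5.000,00
```

Note that the inner //span[@class="newPrice"] looks at the whole document, so this sketch assumes there is only one price block per page, as in the question.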