xml xpath xpath-1.0

Finding minimum value using XPath 1.0 does not work


I am trying to find the minimum value of a certain element in an XML document (it is actually an HTML table that has been converted to XML). However, the query does not work as intended.

The query is similar to the one used in "How can I use XPath to find the minimum value of an attribute in a set of elements?". It looks like this:

/table[@id="search-result-0"]/tbody/tr[
    not(substring-before(td[1], " ") > substring-before(../tr/td[1], " "))
]

Executed on the example XML

<table class="tablesorter" id="search-result-0">
    <thead>
        <tr>
            <th class="header headerSortDown">Preis</th>
            <th class="header headerSortDown">Zustand</th>
        </tr>
    </thead>
    <tbody>
        <tr>
            <td width="45px">15 CHF</td>
            <td width="175px">Ausgepack und doch nie gebraucht</td>
        </tr>
        <tr>
            <td width="45px">20 CHF</td>
            <td width="175px">Ausgepack und doch nie gebraucht</td>
        </tr>
        <tr>
            <td width="45px">25 CHF</td>
            <td width="175px">Ausgepack und doch nie gebraucht</td>
        </tr>
        <tr>
            <td width="45px">35 CHF</td>
            <td width="175px">Ausgepack und doch nie gebraucht</td>
        </tr>
        <tr>
            <td width="45px">14 CHF</td>
            <td width="175px">Gebraucht, aber noch in Ordnung</td>
        </tr>
        <tr>
            <td width="45px">15 CHF</td>
            <td width="175px">Gebraucht, aber noch in Ordnung</td>
        </tr>
        <tr>
            <td width="45px">15 CHF</td>
            <td width="175px">Gebraucht, aber noch in Ordnung</td>
        </tr>
    </tbody>
</table>

the query returns the following result:

<tr>
<td width="45px">15 CHF</td>
<td width="175px">Ausgepack und doch nie gebraucht</td>
</tr>
-----------------------
<tr>
<td width="45px">14 CHF</td>
<td width="175px">Gebraucht, aber noch in Ordnung</td>
</tr>
-----------------------
<tr>
<td width="45px">15 CHF</td>
<td width="175px">Gebraucht, aber noch in Ordnung</td>
</tr>
-----------------------
<tr>
<td width="45px">15 CHF</td>
<td width="175px">Gebraucht, aber noch in Ordnung</td>
</tr>

Why is more than one node returned? Exactly one node should be returned, since there is only a single minimum. Does anybody see what is wrong with the query? It should return only the row containing 14 CHF.

Results obtained using http://xpath.online-toolz.com/tools/xpath-editor.php


Solution

  • In the meantime I decided to use XSLT instead. This is the stylesheet I came up with:

    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns="http://www.w3.org/1999/xhtml">
    
        <xsl:output method="text" omit-xml-declaration="yes" indent="no" encoding="UTF-8"/>
        <xsl:strip-space elements="*"/> 
    
        <xsl:template match="//table[@id='search-result-0']/tbody">
            <ul>
                <xsl:for-each select="tr/td[@width='45px']">
                    <xsl:sort select="substring-before(., ' ')" data-type="number" order="ascending"/>
    
                    <xsl:if test="position() = 1">
                        <xsl:value-of select="substring-before(., ' ')"/>
                    </xsl:if>
                </xsl:for-each>
            </ul>
        </xsl:template>
    
        <xsl:template match="text()"/> <!-- ignore the plain text -->
    
    </xsl:stylesheet>
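
As for why the original query returned four rows: in XPath 1.0, substring-before() expects a string, so the node-set ../tr/td[1] is converted to the string-value of its first node in document order ("15 CHF"). Each row is therefore compared against 15 rather than against every other price, and every row priced at 15 or less passes the predicate. The following is a minimal Python sketch of that behaviour next to the sort-and-take-first approach used in the stylesheet. It only models the semantics in plain Python and does not run an XPath or XSLT engine; the helper names and the shortened sample table are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the table from the question: only the price cells are kept.
SAMPLE = """
<table id="search-result-0">
  <tbody>
    <tr><td>15 CHF</td></tr>
    <tr><td>20 CHF</td></tr>
    <tr><td>25 CHF</td></tr>
    <tr><td>35 CHF</td></tr>
    <tr><td>14 CHF</td></tr>
    <tr><td>15 CHF</td></tr>
    <tr><td>15 CHF</td></tr>
  </tbody>
</table>
"""


def price(td_text):
    """Mimic number(substring-before(text, ' ')) for cells like '15 CHF'."""
    return float(td_text.split(" ", 1)[0])


def prices_in_document_order(xml_text):
    """Read the first td of every body row, in document order."""
    root = ET.fromstring(xml_text)
    return [price(tr.find("td").text) for tr in root.findall("./tbody/tr")]


def original_predicate(prices):
    """Model of the query in the question.

    The node-set ../tr/td[1] collapses to its FIRST node when it is passed
    to substring-before(), so every row is compared against the first price
    only, not against all prices.
    """
    first = prices[0]
    return [p for p in prices if not (p > first)]


def sort_and_take_first(prices):
    """Model of the stylesheet: sort numerically ascending, keep position 1."""
    return sorted(prices)[0]


prices = prices_in_document_order(SAMPLE)
print(original_predicate(prices))   # [15.0, 14.0, 15.0, 15.0]
print(sort_and_take_first(prices))  # 14.0
```

The first list matches the four rows shown in the question (15, 14, 15, 15), while sorting and keeping the first item yields the single minimum of 14.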