Search code examples
xmlcountxml-parsingxqueryxquery-3.0

Why is the XQuery expression returning the wrong element count?


I want return the number of L1 and L0 items per schedule and also the number of warnings, also per schedule.

It is actually a "count if" situation.

I tried the following XQuery, which counts L1, L0 and warnings fine, but does count all warnings instead of only the ones with value = "yes".

xquery version "3.0";

let $nl := "
"
let $quote := """
let $pipe := "|"
let $nodecount := 0

for $profiles in doc("maik test.xml")/PROFILE
for $schedule in $profiles/SCHEDULE
let $schedulename := $schedule/@name
group by $schedulename
return ($nl, 
$schedulename, $pipe, "L0 count:", count($schedule/L0),
$pipe, "L0 Warnings:", count($schedule/L0/ATTRIBUTE[@NAME = "Warnings"]/VALUE/string() = "Yes"),
$pipe, "L1 count:", count($schedule/L0/L1),
$pipe, "L1 Warnings:", count($schedule/L0/L1/ATTRIBUTE[@NAME = "Warnings"]/VALUE/string() = "Yes"))

Example XML:

<?xml version="1.0" encoding="UTF-8"?>
<PROFILE name="profile1">
    <SCHEDULE name="schedule1">
        <L0>
            <ATTRIBUTE NAME="Warnings">
                <VALUE>No</VALUE>
            </ATTRIBUTE>
            <L1>
                <ATTRIBUTE NAME="Warnings">
                    <VALUE>No</VALUE>
                </ATTRIBUTE>
            </L1>
            <L1> 
                <ATTRIBUTE NAME="Warnings">
                    <VALUE>No</VALUE>
                </ATTRIBUTE>
            </L1>
            <L1>
                <ATTRIBUTE NAME="Warnings">
                    <VALUE>Yes</VALUE>
                </ATTRIBUTE>
            </L1>
            <L1></L1>
        </L0>
        <L0>
            <ATTRIBUTE NAME="Warnings">
                <VALUE>No</VALUE>
            </ATTRIBUTE>
            <L1></L1>
        </L0>
    </SCHEDULE>
    <SCHEDULE name="schedule2">
        <L0>
            <L1></L1>
            <L1></L1>
            <L1></L1>
            <L1></L1>
        </L0>
        <L0>
            <L1></L1>
        </L0>
        <L0>
            <L1></L1>
            <L1></L1>
            <L1></L1>
        </L0>
    </SCHEDULE>
</PROFILE>

Solution

  • Actually, you're also counting the L0-warnings wrong -- but by chance, they're result is the correct one.

    Try to return $schedule/L0/L1/ATTRIBUTE[@NAME = "Warnings"]/VALUE/string() = "Yes" (without the aggregation) to get a feeling what is going wrong. This subquery returns a boolean which is true if any of the values on the left side matches any of the value on the right side (which in this case, is only a single one). With other words, if any of the warnings is Yes, the statement gets true. If none match at all, false is returned. In XQuery, = has a set-semantics.

    Now, you're counting the number of results, which -- no matter what the boolean result was -- is 1.

    To wrap up, there are two problems in the code:

    • Don't use = if you want to compare single elements, use eq instead (there are also lq, gq, le, ge and ne for <, >, <=, >=, !=).
    • Don't compare the result sequence (which will return a boolean value), but filter it using a predicate.

    A proper subquery for the L0 warnings would be

    count($schedule/L0/ATTRIBUTE[@NME = "Warnings"]/VALUE[string() eq "Yes"])
    

    and similarly for the L1 warnings:

    count($schedule/L0/L1/ATTRIBUTE[@NAME = "Warnings"]/VALUE[string() eq "Yes"])