
XPath query to resolve all IDREFS in an attribute (possibly containing many IDs)


I need to come up with a query that returns the products whose type had no items sold. That is, if an item is of the type clothing and no clothing items appear in the list of transactions, I need to display it.

This is my XML file (apologies for the super-Canadian-ness):

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE store [

<!ELEMENT store (product*, transaction*)>
<!ATTLIST store name CDATA #REQUIRED>

<!ELEMENT product EMPTY>
<!ATTLIST product
    name ID #REQUIRED
    type CDATA #REQUIRED
    price CDATA #REQUIRED
>

<!ELEMENT transaction EMPTY>
<!ATTLIST transaction
    products IDREFS #REQUIRED
    sumPrice CDATA #REQUIRED
>

]>
<store name="Gordons">
<product name="beaverCoat" type="clothing" price="100"/>
<product name="hockeyStick" type="equipment" price="30"/>
<product name="hockeyPuck" type="equipment" price="5"/>
<product name="icePick" type="equipment" price="40"/>
<product name="mooseMeat" type="food" price="350"/>
<product name="salmon" type="food" price="15"/>
<transaction products="hockeyPuck hockeyStick" sumPrice="35"/>
<transaction products="hockeyStick mooseMeat" sumPrice="380"/>
<transaction products="salmon mooseMeat" sumPrice="365"/>
<transaction products="hockeyStick hockeyStick hockeyStick" sumPrice="30"/>
</store>

DESIRED OUTPUT

<product name="beaverCoat" type="clothing"/>, because it is a product from a category (clothing) from which nothing was bought, i.e. no transaction includes clothing.

MY ATTEMPT

I've played around with some queries but I just can't get it right. This is the closest I've gotten:

//product[@type != //transactions/@products/@type]

It seems like this should work: find all the products whose type is not equal to any of the types in all of the transactions. However, I am getting lots of errors.

I would really appreciate it if someone could provide the solution with a little explanation.


Solution

  • You can use the id() function to get the node-set of all items that have been sold, by applying it to the node-set of products attributes of the transaction elements:

    id(//transaction/@products)
    

    You can easily extend that to get the types of the items that have been sold:

    id(//transaction/@products)/@type
    

    What you want is all products where the type is not in this set, which is given by:

    //product[not(@type = id(//transaction/@products)/@type)] 
    

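    Using this on your example XML selects only the beaverCoat product node. Note the use of not(@type = ...) rather than @type != ...: in XPath 1.0, comparing a value to a node-set with != is true as soon as any node in the set differs, so it would also match products whose type was sold.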
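
For readers who want to see what the expression computes step by step, here is a minimal Python sketch that emulates it by hand. It is not the XPath itself: Python's standard xml.etree.ElementTree does not implement the XPath id() function, so the ID lookup is replaced by a plain dictionary keyed on the name attribute. The function name unsold_type_products and the constant STORE_XML are made up for illustration, and the DTD is left out of the sample because this emulation does not use it.

```python
import xml.etree.ElementTree as ET

# Copy of the store data from the question. The DTD is omitted on purpose:
# this sketch resolves IDs with a dictionary instead of the DTD's ID type.
STORE_XML = """\
<store name="Gordons">
  <product name="beaverCoat" type="clothing" price="100"/>
  <product name="hockeyStick" type="equipment" price="30"/>
  <product name="hockeyPuck" type="equipment" price="5"/>
  <product name="icePick" type="equipment" price="40"/>
  <product name="mooseMeat" type="food" price="350"/>
  <product name="salmon" type="food" price="15"/>
  <transaction products="hockeyPuck hockeyStick" sumPrice="35"/>
  <transaction products="hockeyStick mooseMeat" sumPrice="380"/>
  <transaction products="salmon mooseMeat" sumPrice="365"/>
  <transaction products="hockeyStick hockeyStick hockeyStick" sumPrice="30"/>
</store>
"""


def unsold_type_products(root):
    """Return the products whose type never appears in any transaction."""
    # Stand-in for the ID declaration: index every product by its name.
    by_id = {p.get("name"): p for p in root.iter("product")}

    # Rough equivalent of id(//transaction/@products)/@type:
    # split each IDREFS value on whitespace, look up each ID, collect types.
    sold_types = set()
    for transaction in root.iter("transaction"):
        for ref in transaction.get("products", "").split():
            product = by_id.get(ref)
            if product is not None:
                sold_types.add(product.get("type"))

    # Rough equivalent of //product[not(@type = ...)]:
    # keep only the products whose type is not in the sold set.
    return [p for p in root.iter("product") if p.get("type") not in sold_types]


root = ET.fromstring(STORE_XML)
result = [p.get("name") for p in unsold_type_products(root)]
print(result)  # ['beaverCoat']
```

The set membership test plays the role of not(@type = ...): a product is kept only when its type matches none of the sold types, which is why icePick (equipment, never sold itself) is still excluded.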