XPath Query: Find elements that have the same list of IDs as attributes


I need to come up with a query that gives the products of the types from which no items were sold. Meaning, if an item is of the type clothing and no clothing items appear in the list of transactions, I need to display it.

This is my XML file (apologies for the super-Canadian-ness):

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE store [

<!ELEMENT store (product*, transaction*)> 
<!ATTLIST store name CDATA #REQUIRED > 

<!ELEMENT product EMPTY>
<!ATTLIST product
    name ID #REQUIRED
    type CDATA #REQUIRED
    price CDATA #REQUIRED
>

<!ELEMENT transaction EMPTY>
<!ATTLIST transaction
    products IDREFS #REQUIRED
    sumPrice CDATA #REQUIRED
>

]>
<store name="Gordons">
<product name="beaverCoat" type="clothing" price="100"/>
<product name="hockeyStick" type="equipment" price="30"/>
<product name="hockeyPuck" type="equipment" price="5"/>
<product name="icePick" type="equipment" price="40"/>
<product name="mooseMeat" type="food" price="350"/>
<product name="salmon" type="food" price="15"/>
<transaction products="salmon mooseMeat" sumPrice="365"/>    
<transaction products="hockeyPuck hockeyStick" sumPrice="35"/>
<transaction products="hockeyStick mooseMeat" sumPrice="380"/>
<transaction products="salmon mooseMeat" sumPrice="300"/>
<transaction products="hockeyStick hockeyStick hockeyStick" sumPrice="30"/>
</store>

DESIRED OUTPUT

<transaction products="salmon mooseMeat" sumPrice="365"/>
<transaction products="salmon mooseMeat" sumPrice="300"/>

because each of these transactions has the same products as another transaction (each other).

MY ATTEMPT

I've played around with some queries, but I just can't get it right. This is the closest I've gotten:

//transaction[id(@products)  = //transaction/@products]

It seems like this should work - find all the transactions whose products all match the products attribute of other transactions. However, it returns no hits.


Solution

  • EDIT: For some reason I thought this was an XSLT question. Here's one way you could do this with just XPath:

    //transaction[(@products = preceding-sibling::transaction/@products or
                   @products = following-sibling::transaction/@products)]
    

    Here's how you could select only the distinct ones among them, skipping any transaction whose products and sumPrice both match an earlier transaction (requires XPath 2.0):

    //transaction[(@products = preceding-sibling::transaction/@products or
                   @products = following-sibling::transaction/@products) and
                   not(concat(@products, '+', @sumPrice) = 
                       preceding-sibling::transaction/concat(@products, '+', @sumPrice))]
    

    Here's one way to do this with XSLT:

    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output method="xml" indent="yes" omit-xml-declaration="yes"/>
      <xsl:key name="kTrans" match="transaction" use="@products" />
    
      <xsl:template match="@* | node()">
        <xsl:copy>
          <xsl:apply-templates select="@* | node()"/>
        </xsl:copy>
      </xsl:template>
    
      <xsl:template match="/*">
        <n>
          <xsl:apply-templates select="transaction[key('kTrans', @products)[2]]" />
        </n>
      </xsl:template>
    </xsl:stylesheet>
    

    When run on your sample input, the result is:

    <n>
      <transaction products="salmon mooseMeat" sumPrice="365" />
      <transaction products="salmon mooseMeat" sumPrice="300" />
    </n>
    

    Note that I've wrapped the result in an n element because XML with more than one root element is not well-formed.
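
As a cross-check outside an XPath engine, the same grouping idea as the xsl:key approach above can be sketched with Python's standard library. This is only an illustration under assumptions: the function name repeated_transactions and its ignore_order flag are made up for this example, the DTD is left out of the inline sample, and products values are compared as whitespace-split tokens rather than as raw strings. The XPath and XSLT versions compare @products as a plain string, so "salmon mooseMeat" and "mooseMeat salmon" would not match there; the hypothetical ignore_order option shows one way to treat them as the same list.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Inline copy of the sample data from the question (DTD omitted on purpose).
STORE_XML = """\
<store name="Gordons">
  <product name="beaverCoat" type="clothing" price="100"/>
  <product name="hockeyStick" type="equipment" price="30"/>
  <product name="hockeyPuck" type="equipment" price="5"/>
  <product name="icePick" type="equipment" price="40"/>
  <product name="mooseMeat" type="food" price="350"/>
  <product name="salmon" type="food" price="15"/>
  <transaction products="salmon mooseMeat" sumPrice="365"/>
  <transaction products="hockeyPuck hockeyStick" sumPrice="35"/>
  <transaction products="hockeyStick mooseMeat" sumPrice="380"/>
  <transaction products="salmon mooseMeat" sumPrice="300"/>
  <transaction products="hockeyStick hockeyStick hockeyStick" sumPrice="30"/>
</store>
"""


def repeated_transactions(xml_text, ignore_order=False):
    """Return transaction elements whose products list occurs on more than one transaction.

    Hypothetical helper for illustration; not part of any library.
    """
    root = ET.fromstring(xml_text)
    transactions = root.findall("transaction")

    def group_key(transaction):
        # An IDREFS value is a whitespace-separated list of tokens.
        tokens = transaction.get("products", "").split()
        # Sorting the tokens makes the comparison independent of their order.
        return tuple(sorted(tokens)) if ignore_order else tuple(tokens)

    # Count how many transactions share each key, like the xsl:key lookup.
    counts = Counter(group_key(t) for t in transactions)
    # Keep document order; keep only members of groups with two or more entries.
    return [t for t in transactions if counts[group_key(t)] > 1]


if __name__ == "__main__":
    for t in repeated_transactions(STORE_XML):
        print(t.get("products"), t.get("sumPrice"))
```

Counting first and filtering second keeps the matches in document order, which mirrors what the key('kTrans', @products)[2] test does in the stylesheet.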