java xml scala xstream

Parsing XML with multiple, identical tags


Before anything else, I should mention that I have looked at the other answers here. Unfortunately, they didn't apply to my situation, and the best one just provided a smattering of code without actually explaining anything.

I have XML files with contents such as this:

<TransactionLine status="normal">
 <ItemLine>
  <ItemCode>
   <POSCodeFormat format="upcA"></POSCodeFormat>
   <POSCode>074804007527</POSCode>
   <POSCodeModifier name="pc">1</POSCodeModifier>
  </ItemCode>
  <Description>EP  PK WINS EACH</Description>
  <EntryMethod method="scan"></EntryMethod>
  <ActualSalesPrice>2.99</ActualSalesPrice>
  <MerchandiseCode>1</MerchandiseCode>
  <SellingUnits>1</SellingUnits>
  <RegularSellPrice>2.99</RegularSellPrice>
  <SalesQuantity>1</SalesQuantity>
  <SalesAmount>2.99</SalesAmount>
  <ItemTax>
   <TaxLevelID>101</TaxLevelID>
  </ItemTax>
  <SalesRestriction>
   <SalesRestrictFlag value="no"  type="other"></SalesRestrictFlag>
  </SalesRestriction>
 </ItemLine>
</TransactionLine>
<TransactionLine status="normal">
 <ItemLine>
  <ItemCode>
   <POSCodeFormat format="upcA"></POSCodeFormat>
   <POSCode>030004344770</POSCode>
   <POSCodeModifier name="pc">1</POSCodeModifier>
  </ItemCode>
  <Description>MCRFBER TOW EACH</Description>
  <EntryMethod method="scan"></EntryMethod>
  <ActualSalesPrice>1</ActualSalesPrice>
  <MerchandiseCode>1</MerchandiseCode>
  <SellingUnits>1</SellingUnits>
  <RegularSellPrice>1</RegularSellPrice>
  <SalesQuantity>1</SalesQuantity>
  <SalesAmount>1</SalesAmount>
  <ItemTax>
   <TaxLevelID>101</TaxLevelID>
  </ItemTax>
  <SalesRestriction>
   <SalesRestrictFlag value="no"  type="other"></SalesRestrictFlag>
  </SalesRestriction>
 </ItemLine>
</TransactionLine>

A given file will have multiple TransactionLine elements. The differentiating factor between them is the POS code (the POSCode element). My main problem is: how do I drill down to the point where I can actually use that differentiating value to start putting information into the necessary objects? Simply removing them as I go isn't an option. I can't control the XML output, so I can't make it more usable. I'm using XStream as the XML parser. Solutions in Java are preferable, but Scala is also okay.


Solution

  • Assuming your XML has a root node, you could use an XPath expression to select the relevant node, something like:

    //TransactionLine/ItemLine/ItemCode[POSCode='074804007527']/../..
    

    In Java, using the JDK's built-in javax.xml.xpath API (rather than XStream itself), that would be something like:

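    import java.io.File;
    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.xpath.*;
    import org.w3c.dom.*;

    String idimlookingfor = "074804007527";

    // Parse the file into a DOM document ("transactions.xml" is a placeholder path)
    Document document = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder().parse(new File("transactions.xml"));

    XPath xpath = XPathFactory.newInstance().newXPath();
    // Quote the code so it is compared as a string and keeps its leading zero
    String xpathexpression = String.format(
            "//TransactionLine/ItemLine/ItemCode[POSCode='%s']/../..", idimlookingfor);
    XPathExpression expr = xpath.compile(xpathexpression);

    // Each node in the result is a matching <TransactionLine> element
    NodeList nodes = (NodeList) expr.evaluate(document, XPathConstants.NODESET);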
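
Once the matching node is found, its fields can be read with further XPath expressions evaluated relative to that node. The following is a minimal, self-contained sketch of that idea using only the JDK, not a definitive implementation. Several things in it are assumptions for illustration: the `<Transactions>` root element, the class name `PosLookup`, the method name `findLine`, and the choice of a `Map` as the target object. It also uses a predicate on `TransactionLine` instead of the `/../..` steps; both forms should select the same element.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class PosLookup {
    // Trimmed copy of the sample data; <Transactions> is a hypothetical root element.
    static final String SAMPLE =
        "<Transactions>"
      + "<TransactionLine status=\"normal\"><ItemLine>"
      + "<ItemCode><POSCode>074804007527</POSCode></ItemCode>"
      + "<Description>EP  PK WINS EACH</Description>"
      + "<SalesAmount>2.99</SalesAmount>"
      + "</ItemLine></TransactionLine>"
      + "<TransactionLine status=\"normal\"><ItemLine>"
      + "<ItemCode><POSCode>030004344770</POSCode></ItemCode>"
      + "<Description>MCRFBER TOW EACH</Description>"
      + "<SalesAmount>1</SalesAmount>"
      + "</ItemLine></TransactionLine>"
      + "</Transactions>";

    /** Returns selected fields of the line with the given POS code, or null if none matches. */
    static Map<String, String> findLine(String xml, String posCode) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();

            // Select the <TransactionLine> whose nested POSCode equals posCode.
            // Assumption: posCode never contains a single quote.
            String expr = String.format(
                "//TransactionLine[ItemLine/ItemCode/POSCode='%s']", posCode);
            Node line = (Node) xpath.evaluate(expr, doc, XPathConstants.NODE);
            if (line == null) {
                return null;
            }

            // Relative expressions are evaluated with the matched line as context.
            Map<String, String> fields = new LinkedHashMap<>();
            fields.put("status", xpath.evaluate("@status", line));
            fields.put("description", xpath.evaluate("ItemLine/Description", line));
            fields.put("salesAmount", xpath.evaluate("ItemLine/SalesAmount", line));
            return fields;
        } catch (Exception e) {
            // Wrap parser and XPath failures so callers need no checked exceptions.
            throw new IllegalStateException("Could not read transaction XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(findLine(SAMPLE, "030004344770"));
    }
}
```

To process every line rather than one, the same relative expressions could be applied to each node of a `NODESET` result for `//TransactionLine`, keyed by the value of `ItemLine/ItemCode/POSCode`.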