Tags: xml, xslt, xpath, xslt-1.0, xslt-2.0

Grabbing Correct Data from XML data (Using XSLT)


I am looking to grab data from multiple nodes, but am having trouble finding a way to make it work the way I want.

Sample Data:

<Records>
  <Record>
    <ID>100</ID>
    <LatestStep>(Offers:1)=9;(Offers:2)=10;(Offers:3)=7</LatestStep>
    <OfferAmount>(Offers:1)=90000.0;(Offers:2)=77000.0;(Offers:3)=75999.0</OfferAmount>
    <StartDate>(Offers:1)=04/24/2019;(Offers:2)=04/26/2019;(Offers:3)=04/28/2019</StartDate>
    <OfferAmount>(Offers:1)=90000.0;(Offers:2)=77000.0;</OfferAmount>
  </Record>
</Records>

I'd like to grab 77000.0 from the OfferAmount field and 04/26/2019 from StartDate. The logic I need in XSLT is to find which offer has a latest step of 10 in LatestStep, then grab the value after the equals sign for that same offer in the other fields.

    <!-- Current Code (example) -->
    <xsl:variable name="record" select="."/>
    <xsl:variable name="offers">
        <xsl:analyze-string select="LatestStep" regex="\(Offers:([\d]+)\)=10">
            <xsl:matching-substring>
                <offer>
                    <payAmount>
                        <xsl:value-of select="tokenize(replace($record/OfferAmount, '\(Offers:[\d]+\)=',''),';')"/>
                    </payAmount>
                </offer>
            </xsl:matching-substring>
        </xsl:analyze-string>
    </xsl:variable>

Solution

  • With this kind of data, it's often better to tackle this in multiple phases. First phase: turn it into structured XML, the kind of XML that you would have preferred to be given in the first place; second phase, grab the actual data you need.

    The reason for this is that the first phase is often reusable; you can apply the same preprocessing to the data regardless of what you want to do with it afterwards.

    I've no idea what the actual data model is, or what you might find in other examples of your input, but if you wanted to turn

    <StartDate>(Offers:1)=04/24/2019;(Offers:2)=04/26/2019;(Offers:3)=04/28/2019</StartDate>
    

    into

    <StartDate>
      <Offers nr="1">2019-04-24</Offers>
      <Offers nr="2">2019-04-26</Offers>
      <Offers nr="3">2019-04-28</Offers>
    </StartDate>
    

    then you could do this with

    <xsl:template match="StartDate|...">
      <xsl:copy>
        <xsl:for-each select="tokenize(., ';')">
          <Offers nr="{position()}">
            <xsl:value-of select="my:us-date-to-iso(substring-after(., '='))"/>
          </Offers>
        </xsl:for-each>
      </xsl:copy>
    </xsl:template>
    

    where my:us-date-to-iso converts American (mm/dd/yyyy) dates to ISO format in the usual way.
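
    The helper my:us-date-to-iso is not defined above. Here is a minimal sketch of one way to write it as an XSLT 2.0 stylesheet function. The namespace URI bound to the my prefix is a placeholder, and the function assumes two-digit months and days as in the sample data.

    ```xml
    <xsl:stylesheet version="2.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:xs="http://www.w3.org/2001/XMLSchema"
        xmlns:my="http://example.com/my"
        exclude-result-prefixes="xs my">

      <!-- Hypothetical helper: rearranges mm/dd/yyyy into yyyy-mm-dd.
           Input that does not match the pattern is returned as is
           (apart from whitespace normalization). -->
      <xsl:function name="my:us-date-to-iso" as="xs:string">
        <xsl:param name="us-date" as="xs:string"/>
        <xsl:sequence select="replace(normalize-space($us-date),
                                      '^(\d{2})/(\d{2})/(\d{4})$',
                                      '$3-$1-$2')"/>
      </xsl:function>

    </xsl:stylesheet>
    ```

    Only the StartDate values need this conversion; for fields such as LatestStep or OfferAmount the template would presumably emit substring-after(., '=') directly.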

    Then the second phase becomes trivial.
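
    As a rough sketch of that second phase, assuming LatestStep and OfferAmount have been restructured into Offers children in the same way as StartDate, and that this stylesheet runs over the phase-one output rather than the raw input. The Result wrapper element is made up for illustration.

    ```xml
    <xsl:stylesheet version="2.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

      <!-- Runs over the phase-one output, not the raw input. -->
      <xsl:template match="Record">
        <!-- Number of the offer whose latest step is 10. -->
        <xsl:variable name="nr" select="LatestStep/Offers[. = '10']/@nr"/>
        <Result>
          <!-- The sample has two OfferAmount elements, so take the first. -->
          <OfferAmount>
            <xsl:value-of select="OfferAmount[1]/Offers[@nr = $nr]"/>
          </OfferAmount>
          <StartDate>
            <xsl:value-of select="StartDate/Offers[@nr = $nr]"/>
          </StartDate>
        </Result>
      </xsl:template>

    </xsl:stylesheet>
    ```

    For the sample record, the offer with step 10 is number 2, so under these assumptions the lookups should pick out 77000.0 and 2019-04-26.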