Parsing an XML string to XML and extracting node value in XSLT


I've been trying to parse an XML string into XML and extract the values I need using the msxsl:node-set function in XSLT. The XSLT code is given below:

<xsl:template name="parseXML">
  <xsl:param name="xmlString"/>
  <xsl:copy-of select="msxsl:node-set($xmlString)/*"/>
</xsl:template>

<xsl:template match="/">
  <xsl:variable name="parsedXML">
    <xsl:call-template name="parseXML">
      <xsl:with-param name="xmlString" select="result/XML"/>
    </xsl:call-template>
  </xsl:variable>
  <element1><xsl:value-of select="$parsedXML"/></element1>
</xsl:template>

I ran the XSLT on the XML below:

<?xml version="1.0" encoding="UTF-8"?>
<result name="HostIntegrationRequest" id="0071" status="0">
  <XML>
    <![CDATA[<Document xmlns="urn:iso:std:iso:20022:tech:xsd:taco.097.001.90"><GrpHdr><MsgId>20230825066</MsgId></GrpHdr></Document>]]></XML>
  <RefId><![CDATA[4]]></RefId>
</result>

But the resulting XML contains an empty value; I can't even see the value of parsedXML.

Any help solving this issue would be appreciated. If I want to select MsgId, can I do the following?

<element1><xsl:value-of select="$parsedXML/Document/GrpHdr/MsgId"/></element1>


Solution

  • The node-set() function cannot be used to unescape a string into proper XML; it only converts a result tree fragment into a node-set. In XSLT 1.0 you would first need to write the string out unescaped:

    <xsl:stylesheet version="1.0" 
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>
    
    <xsl:template match="/result">
        <xsl:value-of select="XML" disable-output-escaping="yes"/>
    </xsl:template>
    
    </xsl:stylesheet>
    

    and then apply another transformation to the resulting file (which works only if the embedded string is well-formed XML; the sample shown in the question is).
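
    As a rough sketch of that second pass, assuming the first stylesheet's output was saved to a file and is used as the input here: the Document element and its descendants are in the namespace declared in the sample, so an unprefixed path such as Document/GrpHdr/MsgId selects nothing. The stylesheet below binds that namespace to a prefix; the prefix d is an arbitrary choice, and element1 is taken from the question.

    <xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:d="urn:iso:std:iso:20022:tech:xsd:taco.097.001.90"
    exclude-result-prefixes="d">
    <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

    <!-- Input: the unescaped Document produced by the first pass. -->
    <!-- The prefix "d" is illustrative; only the namespace URI has to match. -->
    <xsl:template match="/">
        <element1>
            <xsl:value-of select="d:Document/d:GrpHdr/d:MsgId"/>
        </element1>
    </xsl:template>

    </xsl:stylesheet>

    With the sample from the question, element1 should then contain 20230825066.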

    Alternatively, you could try to extract the data using string manipulation, for example:

    <xsl:stylesheet version="1.0" 
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
    
    <xsl:template match="/result">
        <result>
            <xsl:value-of select="substring-before(substring-after(XML, '&lt;MsgId>'), '&lt;/MsgId>')"/>
        </result>
    </xsl:template>
    
    </xsl:stylesheet>
    

    will return:

    <?xml version="1.0" encoding="UTF-8"?>
    <result>20230825066</result>
    

    However, this is not truly parsing the XML and can easily fail if the lexical representation of the XML varies, for example if MsgId is written with a namespace prefix or an attribute.

    In XSLT 3.0 you can use the parse-xml() function to parse the escaped string as XML.
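
    A minimal sketch of that approach, assuming an XSLT 3.0 processor such as Saxon is available (MSXML implements only XSLT 1.0, so this would not run there). As with any path into the parsed document, the namespace of Document has to be bound to a prefix; d is an arbitrary choice here.

    <xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:d="urn:iso:std:iso:20022:tech:xsd:taco.097.001.90"
    exclude-result-prefixes="d">
    <xsl:output method="xml" encoding="UTF-8" indent="yes"/>

    <xsl:template match="/result">
        <element1>
            <!-- parse-xml() turns the string value of XML into a document node, -->
            <!-- which can then be navigated with ordinary, namespace-aware XPath. -->
            <xsl:value-of select="parse-xml(XML)/d:Document/d:GrpHdr/d:MsgId"/>
        </element1>
    </xsl:template>

    </xsl:stylesheet>

    This does everything in a single transformation and, unlike the substring approach, does not depend on how the embedded XML happens to be written.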