I am attempting to convert a comma-delimited list into an XML file with a hierarchical structure, using XSLT alone and preferably a single transform. There is a similar earlier example, but it does not cover creating subelements, which appears to be a common problem in this kind of transformation and which, to my knowledge, has no clear answer.
Similar Example: XSLT 2.0 to convert CSV to XML format
CSV Example:
ClaimRef,HandlerRef,ClaimType,Date,Area,SettleDate,ClaimStatus,ClaimantName
1,1/1,Liability,08-12-2013,US,23-05-2014,Closed,Mark
2,1/2,Liability,08-10-2013,UK,23-02-2014,Closed,John
Desired XML output format (this differs from the earlier example because it contains subelements):
<Claims>
  <Claim>
    <ClaimRef></ClaimRef>
    <HandlerRef></HandlerRef>
    <ClaimType></ClaimType>
    <Date></Date>
    <Area></Area>
    <SettleDate></SettleDate>
    <ImportantDivision>
      <ClaimStatus></ClaimStatus>
      <ClaimantName></ClaimantName>
    </ImportantDivision>
  </Claim>
</Claims>
Working XSLT Version 2.0 Without Subelements:
<!-- URI and encoding of the CSV file, passed in by the caller -->
<xsl:param name="csv-uri" as="xs:string"/>
<xsl:param name="csv-encoding" as="xs:string"/>
<xsl:template match="/" name="csv2xml">
  <Claims>
    <xsl:variable name="csv" select="unparsed-text($csv-uri, $csv-encoding)"/>
    <!--Get Header-->
    <xsl:variable name="header-tokens" as="xs:string*">
      <xsl:analyze-string select="$csv" regex="\r\n?|\n">
        <xsl:non-matching-substring>
          <xsl:if test="position()=1">
            <xsl:copy-of select="tokenize(.,',')"/>
          </xsl:if>
        </xsl:non-matching-substring>
      </xsl:analyze-string>
    </xsl:variable>
    <!-- Every line after the header becomes one Claim -->
    <xsl:analyze-string select="$csv" regex="\r\n?|\n">
      <xsl:non-matching-substring>
        <xsl:if test="not(position()=1)">
          <Claim>
            <xsl:for-each select="tokenize(.,',')">
              <xsl:variable name="pos" select="position()"/>
              <xsl:element name="{$header-tokens[$pos]}">
                <xsl:value-of select="."/>
              </xsl:element>
            </xsl:for-each>
          </Claim>
        </xsl:if>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </Claims>
</xsl:template>
I would then use a dummy XML file in order to trick the XSLT processor into transforming my CSV file. Perhaps a better question is: how can divisions be distinguished from one another using only XSLT, before tag names, attributes, ids, etc. are created?
You haven't really explained what the criteria for nesting elements are, but, as already pointed out in a comment, you can create the flat XML first and then transform it in any way you like. The following assumes you simply want to nest runs of two or more adjacent elements whose names start with Claim:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema" exclude-result-prefixes="xs" version="2.0">
  <xsl:param name="csv-uri" as="xs:string" select="'test2017062301.txt'"/>
  <xsl:param name="csv-encoding" as="xs:string" select="'Windows-1252'"/>
  <xsl:output indent="yes"/>
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="/" name="csv2xml">
    <Claims>
      <xsl:variable name="csv" select="unparsed-text($csv-uri, $csv-encoding)"/>
      <xsl:variable name="flat-xml">
        <!--Get Header-->
        <xsl:variable name="header-tokens" as="xs:string*">
          <xsl:analyze-string select="$csv" regex="\r\n?|\n">
            <xsl:non-matching-substring>
              <xsl:if test="position() = 1">
                <xsl:copy-of select="tokenize(., ',')"/>
              </xsl:if>
            </xsl:non-matching-substring>
          </xsl:analyze-string>
        </xsl:variable>
        <xsl:analyze-string select="$csv" regex="\r\n?|\n">
          <xsl:non-matching-substring>
            <xsl:if test="not(position() = 1)">
              <Claim>
                <xsl:for-each select="tokenize(., ',')">
                  <xsl:variable name="pos" select="position()"/>
                  <xsl:element name="{$header-tokens[$pos]}">
                    <xsl:value-of select="."/>
                  </xsl:element>
                </xsl:for-each>
              </Claim>
            </xsl:if>
          </xsl:non-matching-substring>
        </xsl:analyze-string>
      </xsl:variable>
      <xsl:apply-templates select="$flat-xml/*"/>
    </Claims>
  </xsl:template>
  <xsl:template match="Claim">
    <xsl:copy>
      <xsl:for-each-group select="*" group-adjacent="starts-with(local-name(), 'Claim')">
        <xsl:choose>
          <xsl:when test="current-grouping-key() and current-group()[2]">
            <ClaimDivision>
              <xsl:copy-of select="current-group()"/>
            </ClaimDivision>
          </xsl:when>
          <xsl:otherwise>
            <xsl:copy-of select="current-group()"/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
The result for your sample input is then:
<?xml version="1.0" encoding="UTF-8"?>
<Claims>
  <Claim>
    <ClaimRef>1</ClaimRef>
    <HandlerRef>1/1</HandlerRef>
    <ClaimType>Liability</ClaimType>
    <Date>08-12-2013</Date>
    <Area>US</Area>
    <SettleDate>23-05-2014</SettleDate>
    <ClaimDivision>
      <ClaimStatus>Closed</ClaimStatus>
      <ClaimantName>Mark</ClaimantName>
    </ClaimDivision>
  </Claim>
  <Claim>
    <ClaimRef>2</ClaimRef>
    <HandlerRef>1/2</HandlerRef>
    <ClaimType>Liability</ClaimType>
    <Date>08-10-2013</Date>
    <Area>UK</Area>
    <SettleDate>23-02-2014</SettleDate>
    <ClaimDivision>
      <ClaimStatus>Closed</ClaimStatus>
      <ClaimantName>John</ClaimantName>
    </ClaimDivision>
  </Claim>
</Claims>
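If the nesting criterion is a fixed list of column names rather than a shared name prefix, the same grouping idea can be driven by a parameter. The following is only a sketch under that assumption: the parameter name division-fields is hypothetical, the wrapper name ImportantDivision is taken from the desired output in the question, and the stylesheet expects the flat Claims/Claim XML as its input document (the Claim template could equally replace the one in the stylesheet above).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema" exclude-result-prefixes="xs" version="2.0">
  <!-- Hypothetical parameter: names of the columns to nest (an assumption, adjust as needed) -->
  <xsl:param name="division-fields" as="xs:string*" select="('ClaimStatus', 'ClaimantName')"/>
  <xsl:output indent="yes"/>
  <!-- Identity template: copies everything that has no more specific rule -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="Claim">
    <xsl:copy>
      <!-- The grouping key is true for adjacent children whose name is in the list -->
      <xsl:for-each-group select="*" group-adjacent="local-name() = $division-fields">
        <xsl:choose>
          <xsl:when test="current-grouping-key()">
            <ImportantDivision>
              <xsl:copy-of select="current-group()"/>
            </ImportantDivision>
          </xsl:when>
          <xsl:otherwise>
            <xsl:copy-of select="current-group()"/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

Because the key is an explicit name test, columns such as ClaimRef and ClaimType stay at the top level without needing the current-group()[2] check. Listed columns that are not adjacent in the CSV would each get their own ImportantDivision wrapper.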