
XSLT to convert a plain text file to XML using XSLT 2.0 or higher


I am working with a client who uses the 'Workday' ERP. To transform data going in and out of the ERP, it supports XML, XSLT and XSD scripting but no other programming languages.

I have a fixed-length text file (sample below) that I am trying to convert to XML for further processing in my code. I have always used XSLT to convert XML to XML or XML to text, but never text to XML.

Can you please guide me or provide a sample XSLT (2.0 or 3.0) that converts the text data below into the target XML (also below)?

Input fixed-length file (the first character of each line is the record type: X and H are headers, and the final T and F lines are trailers. Each employee starts with one E record, followed by multiple W records and optional B records):

X T3.03Q2020320201029015631AACW2                                                                                                                               xxxxxxx                  2020xx                            090420                                
H ZXCV          20200930      ABCABCA ABCABC                                     
E ******13662       372022456           Tony             B                StarkS              99999 Heritage Pkwy                                         zzzzzz                        MI48092                   YNNNMS19960706        19720724               PM                                 99999 Heritage Pkwy                                                             zzzzzz                        MI48092             
WW_SWW26                            61322         1524206         1442835         1442835               0               0               0               0             0               0            215611         5342667         5073153         5073153                               0               0                               0                          NN                 0               0   N  N       0000000000YYY 14  440            0             0             0             0             0   0N                                                                                                                                                                                                                                      
WW_CITYR2665440                      9192          972143          919215          919215               0               0               0               0             0               0              9192          972143          919215          919215                               0               0                               0                          NN                 0               0   N  N       0000000000NYY 14  440            0             0             0             0             0   0N                                                                                                                                                                                                                                      
BW_OASFEDERAL                       93217         1524206         1503506         1503506               0               0               0               0             0               0            327181         5342667         5277117         5277117                               0               0                               0                          NN                 0               0   N  N       0000000000YYY 14  440            0             0             0             0             0   0N                                                                                                                                                                                                                                      
E ******10665       362022493           Thor             S                Asar                2323 Clyde Road                                             Highzzzz                      MI48357                   YNNNMS19990517        19760301               PM                                 2323 Clyde Road                                                                 Highzzzz                      MI48357             
WW_SWW26                            61322         1524206         1442835         1442835               0               0               0               0             0               0            215611         5342667         5073153         5073153                               0               0                               0                          NN                 0               0   N  N       0000000000YYY 14  440            0             0             0             0             0   0N                                                                                                                                                                                                                                      
WW_CITYR2665440                      9192          972143          919215          919215               0               0               0               0             0               0              9192          972143          919215          919215                               0               0                               0                          NN                 0               0   N  N       0000000000NYY 14  440            0             0             0             0             0   0N                                                                                                                                                                                                                                      
BW_OASFEDERAL                       93217         1524206         1503506         1503506               0               0               0               0             0               0            327181         5342667         5277117         5277117                               0               0                               0                          NN                 0               0   N  N       0000000000YYY 14  440            0             0             0             0             0   0N                                                                                                                                                                                                                                      
BW_OASFEDERAL                       93217         1524206         1503506         1503506               0               0               0               0             0               0            327181         5342667         5277117         5277117                               0               0                               0                          NN                 0               0   N  N       0000000000YYY 14  440            0             0             0             0             0   0N                                                                                                                                                                                                                                      
T        39384       1699589934 
F        43442       1854024842 

The expected XML output is something like this:

<?xml version='1.0' encoding='utf-8'?>
<File>
    <X_Header></X_Header>
    <H_Header></H_Header>
    <All_Employees>
        <Employee>
            <E_record></E_record>
            <W_record></W_record>
            <W_record></W_record>
            <W_record></W_record>
            <B_record></B_record>
        </Employee>
        <Employee>
            <E_record></E_record>
            <W_record></W_record>
            <W_record></W_record>
            <W_record></W_record>
            <B_record></B_record>
        </Employee>
    </All_Employees>
    <T_Trailer></T_Trailer>
    <F_Trailer></F_Trailer>
</File>

Solution

  • XSLT 3 code could use, for example:

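      <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
          xmlns:xs="http://www.w3.org/2001/XMLSchema"
          exclude-result-prefixes="#all"
          version="3.0">

      <xsl:output indent="yes"/>

      <!-- Read the text file as a sequence of strings, one per line -->
      <xsl:param name="lines" select="unparsed-text-lines('file.txt')"/>

      <!-- X and H lines become X_Header and H_Header -->
      <xsl:template match=".[. instance of xs:string]" mode="header">
          <xsl:element name="{substring(., 1, 1)}_Header">
              <!-- normalize-space() first, so trailing blanks yield no empty token -->
              <xsl:apply-templates select="tokenize(normalize-space(.), ' ')" mode="data"/>
          </xsl:element>
      </xsl:template>

      <!-- T and F lines become T_Trailer and F_Trailer -->
      <xsl:template match=".[. instance of xs:string]" mode="trailer">
          <xsl:element name="{substring(., 1, 1)}_Trailer">
              <xsl:apply-templates select="tokenize(normalize-space(.), ' ')" mode="data"/>
          </xsl:element>
      </xsl:template>

      <!-- E, W and B lines become E_record, W_record and B_record -->
      <xsl:template match=".[. instance of xs:string]">
          <xsl:element name="{substring(., 1, 1)}_record">
              <xsl:apply-templates select="tokenize(normalize-space(.), ' ')" mode="data"/>
          </xsl:element>
      </xsl:template>

      <!-- Each whitespace-separated token becomes one Data element -->
      <xsl:template match="." mode="data" expand-text="yes">
          <Data>{.}</Data>
      </xsl:template>

      <xsl:template match="/" name="xsl:initial-template">
        <File>
            <xsl:apply-templates mode="header" select="$lines[starts-with(., 'H') or starts-with(., 'X')]"/>
            <All_Employees>
                <!-- Every E line starts a new group that collects the W and B lines following it -->
                <xsl:for-each-group select="$lines[not(matches(., '^[HXTF]'))]" group-starting-with=".[starts-with(., 'E')]">
                    <Employee>
                        <xsl:apply-templates select="current-group()"/>
                    </Employee>
                </xsl:for-each-group>
            </All_Employees>
            <xsl:apply-templates mode="trailer" select="$lines[starts-with(., 'T') or starts-with(., 'F')]"/>
        </File>
      </xsl:template>

      </xsl:stylesheet>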
    

    You haven't spelled out how to parse each line, so the code above simply splits every line on whitespace; you can easily adapt the tokenization and the templates to your fixed-width layout.
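
    As one possible adaptation, the sketch below slices an E line by column position instead of splitting on whitespace. The field names, start columns and lengths in `$e-fields` are hypothetical placeholders, not taken from the real file specification, and the sample line in the initial template is made up; substitute the actual layout before using it.

    ```xml
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:xs="http://www.w3.org/2001/XMLSchema"
        exclude-result-prefixes="#all"
        version="3.0">

      <xsl:output indent="yes"/>

      <!-- HYPOTHETICAL layout for E lines: element name, start column, length.
           Replace these values with the ones from the real file specification. -->
      <xsl:variable name="e-fields" as="element(field)*">
        <field name="Record_Type" start="1" length="1"/>
        <field name="Employee_ID" start="3" length="18"/>
        <field name="Number" start="21" length="20"/>
        <field name="First_Name" start="41" length="17"/>
      </xsl:variable>

      <!-- The explicit priority lets this template win over a generic
           match=".[. instance of xs:string]" template in the same stylesheet,
           which would otherwise have the same default priority. -->
      <xsl:template match=".[. instance of xs:string][starts-with(., 'E')]" priority="1">
        <xsl:variable name="line" as="xs:string" select="."/>
        <E_record>
          <xsl:for-each select="$e-fields">
            <!-- Cut the field out of the line and trim the padding blanks -->
            <xsl:element name="{@name}">
              <xsl:value-of select="normalize-space(substring($line, @start, @length))"/>
            </xsl:element>
          </xsl:for-each>
        </E_record>
      </xsl:template>

      <!-- Entry point for trying the template on its own with a made-up line -->
      <xsl:template name="xsl:initial-template">
        <xsl:apply-templates select="'E 000123            123456789           Tony'"/>
      </xsl:template>

    </xsl:stylesheet>
    ```

    Keeping the layout in a variable means the W, B, header and trailer lines can each get their own field list and reuse the same `substring` loop, rather than hard-coding offsets in every template.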