
Processing an HL7-type message using XSLT or regex, or a combination of the two (XSLT 1.0)


I have an HL7-type message that I have to transform using regex, XSLT, or a combination of the two.

The format of this message is DateTime (as in YYYYMMDDHHMMSS)^UnitName^room^bed|. Each location ends with a pipe, and each person can have one or more locations. When a patient has only one location, the message looks like this:

20130602201605^Some Hospital^ABFG^411|

The resulting XML should look like this:

<Location>
 <item>
  <when>20130602201605</when>
  <UnitName>Some Hospital</UnitName>
  <room>ABFG</room>
  <bed>411</bed>
 </item>
</Location>

If there were only one location, I would probably use a substring-type function. The problem I am running into is when there is more than one. I am relatively new to XSLT and regex in general, so I don't know how to use recursion in these cases.

So if I have a message like this with multiple locations:

20130601003203^GBMC^XXYZ^110|20130602130600^Sanai^ABC^|20130602150003^John Hopkins^J615^A|

The end result should be:

<Location>

 <item>
   <when>20130601003203</when>
   <UnitName>GBMC</UnitName>
   <room>XXYZ</room>
   <bed>110</bed>
 </item>

 <item>
  <when>20130602130600</when>
  <UnitName>Sanai</UnitName>
  <room>ABC</room>
  <bed></bed>
 </item>

 <item>
  <when>20130602150003</when>
  <UnitName>John Hopkins</UnitName>
  <room>J615</room>
  <bed>A</bed>
 </item>

</Location>

So how would I solve this? Thanks in advance.


Solution

  • Given that your HL7 message is "|^~\&" encoded and not in an XML format, it is not clear how you will be using an XSLT 1.0 processor for your task. Can you describe your processing pipeline in greater detail? Your snippets are not complete messages, and it is not clear whether you will be starting with complete messages or parsing isolated fields that are handed to a larger processing task, for example through parameters.

    If your processing starts with a complete HL7 message, I would suggest looking into the HAPI project, or a similar set of libraries, to have the messages converted from |^~\& to </> format, then invoking your XSLT on that version of the data. (You could also use the HAPI libraries in a full-Java solution. In either case, there are code examples at the HAPI site and at an Apache site on HL7.) If you are not interested in using Java at all, but are open to partial non-XSLT solutions, there are other projects that provide similar serialization options (e.g., Net::HL7 for Perl, nHAPI for VB/C#, etc.).

    If you have isolated "|^~\&" encoded data in an otherwise XML formatted file, then I would suggest looking into the str:tokenize function from the EXSLT extension functions, which a number of XSLT 1.0 processors support. (XSLT 2.0 has a built-in tokenize function.) You can have str:tokenize split your data on the field or component separators, then create elements using the tokenized substrings.

    Here is a stylesheet

    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet 
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:str="http://exslt.org/strings"
        extension-element-prefixes="str"
        version="1.0">
    
        <xsl:output method="xml" indent="yes"/>
    
        <xsl:template match="data">
            <Location>
            <xsl:for-each select="str:tokenize(.,'|')">
                <xsl:call-template name="handle-field">
                    <xsl:with-param name="field" select="."/>
                </xsl:call-template>
            </xsl:for-each>
            </Location>
        </xsl:template>
    
        <xsl:template name="handle-field">
            <xsl:param name="field"/>
            <xsl:variable name="components" select="str:tokenize($field,'^')"/>
            <item>
                <when><xsl:value-of select="$components[1]"/></when>
                <UnitName><xsl:value-of select="$components[2]"/></UnitName>
                <room><xsl:value-of select="$components[3]"/></room>
                <bed><xsl:value-of select="$components[4]"/></bed>
            </item>
        </xsl:template>
    
    </xsl:stylesheet>
    

    that runs over this input

    <?xml version="1.0" encoding="UTF-8"?>
    <data>20130601003203^GBMC^XXYZ^110|20130602130600^Sanai^ABC^|20130602150003^John Hopkins^J615^A|</data>
    

    to produce this output with xsltproc:

    <?xml version="1.0"?>
    <Location>
      <item>
        <when>20130601003203</when>
        <UnitName>GBMC</UnitName>
        <room>XXYZ</room>
        <bed>110</bed>
      </item>
      <item>
        <when>20130602130600</when>
        <UnitName>Sanai</UnitName>
        <room>ABC</room>
        <bed/>
      </item>
      <item>
        <when>20130602150003</when>
        <UnitName>John Hopkins</UnitName>
        <room>J615</room>
        <bed>A</bed>
      </item>
    </Location>
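
    If your processor does not support EXSLT, the same split can be sketched in plain XSLT 1.0 with a recursive named template, which is the recursion the question asks about. This is only a sketch under some assumptions: the template name split-fields and the variable names are made up for illustration, every location is assumed to end with a pipe, and every location is assumed to carry exactly four caret-separated components. It relies only on the standard XPath 1.0 functions contains, substring-before, and substring-after.

    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        version="1.0">

        <xsl:output method="xml" indent="yes"/>

        <xsl:template match="data">
            <Location>
                <xsl:call-template name="split-fields">
                    <xsl:with-param name="rest" select="."/>
                </xsl:call-template>
            </Location>
        </xsl:template>

        <!-- Hypothetical helper: emits one <item> for the text before the
             first '|', then calls itself on whatever follows that pipe.
             Recursion stops when no pipe is left in the remaining text. -->
        <xsl:template name="split-fields">
            <xsl:param name="rest"/>
            <xsl:if test="contains($rest, '|')">
                <xsl:variable name="field" select="substring-before($rest, '|')"/>
                <!-- Peel off one component at a time from the front. -->
                <xsl:variable name="afterWhen" select="substring-after($field, '^')"/>
                <xsl:variable name="afterUnit" select="substring-after($afterWhen, '^')"/>
                <item>
                    <when><xsl:value-of select="substring-before($field, '^')"/></when>
                    <UnitName><xsl:value-of select="substring-before($afterWhen, '^')"/></UnitName>
                    <room><xsl:value-of select="substring-before($afterUnit, '^')"/></room>
                    <bed><xsl:value-of select="substring-after($afterUnit, '^')"/></bed>
                </item>
                <xsl:call-template name="split-fields">
                    <xsl:with-param name="rest" select="substring-after($rest, '|')"/>
                </xsl:call-template>
            </xsl:if>
        </xsl:template>

    </xsl:stylesheet>

    Run over the same <data> input as above, this should yield the same three <item> elements. One difference worth checking against your data: because each component is cut out by position, an empty component in the middle of a location (not only an empty bed at the end) stays in its own element, whereas, as far as I know, str:tokenize does not return empty tokens and would shift the later components forward.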