
Easy way to apply an XSLT to an xml file


I have an XML file whose tags are very cryptic (they are HL7 2.3 segment names).

I know what all the tags mean in plain English, but looking them up every time I have to read the file is a pain.

As I understand it, I can write an XSLT that will let me view my XML file formatted in an easy-to-read manner.

Once I have written this XSLT, is there a quick and easy way to apply it to my XML document (i.e. copy and paste the XML from SQL or MSMQ and have it automatically formatted according to the XSLT)?

(NOTE: The XML files contain data that I cannot upload to the web in any way, so publicly hosted web solutions will not work.)

Update for Bounty:
I added a bounty for a version of Maestro13's answer modified to match on both the parent and the current node (rather than just the current node).

Since this is Maestro13's answer in the first place, I will award the bounty to him over other similar answers.


Solution

  • I wrote a small example for you below.

    To run this, you will need a tool like XML Spy or Oxygen, or you can install the Java JDK and Saxon 9 HE and run it from the command line, as explained by Alp. In Oxygen or XML Spy the translation file is specified in a parameter list popup. On the Saxon command line, stylesheet parameters are passed as name=value pairs after the options (see the Saxon command line syntax); the parameter to set here is HL7_translations.
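
    The stylesheets in this answer are XSLT 1.0, so the transformer that ships with the JDK (javax.xml.transform) should also be able to run them without any extra download. Below is a minimal sketch of such a runner. The class name AnnotateHl7 and the argument order are my own choices for illustration, and it assumes the stylesheet declares the HL7_translations parameter exactly as shown in this answer. The translation file is passed as an absolute file: URI so that document() does not depend on the working directory.

```java
import java.io.StringWriter;
import java.nio.file.Path;
import java.nio.file.Paths;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper (not part of any library): applies a stylesheet to a
// local XML file and hands the translation file to it as a parameter.
class AnnotateHl7 {

    static String run(Path input, Path stylesheet, Path translations)
            throws TransformerException {
        // Compile the stylesheet with whatever XSLT processor the JDK provides.
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(stylesheet.toFile()));

        // Assumption: the stylesheet has <xsl:param name="HL7_translations"/>
        // and loads it with document($HL7_translations).
        transformer.setParameter("HL7_translations",
                translations.toAbsolutePath().toUri().toString());

        // Transform into a string so the caller can print or save the result.
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(input.toFile()), new StreamResult(out));
        return out.toString();
    }

    // Usage: java AnnotateHl7 <input.xml> <stylesheet.xsl> <translations.xml>
    public static void main(String[] args) throws TransformerException {
        System.out.println(run(Paths.get(args[0]), Paths.get(args[1]), Paths.get(args[2])));
    }
}
```

    Compile it with javac AnnotateHl7.java and run it as java AnnotateHl7 input.xml annotate.xsl translations.xml (the three file names are placeholders). Everything stays on the local machine, so nothing is uploaded anywhere.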

    Input file (not a real example, just an extract of HL7-like XML):

    <?xml version="1.0" encoding="UTF-8"?>
    <ADT_A03>
        <MSH>
            <MSH.7>19900314130405</MSH.7>
        </MSH>
        <EVN>
            <EVN.6>19980327095000</EVN.6>
        </EVN>
        <PID>
            <PID.4.LST>
                <PID.4>
                    <CX.1>123456789ABCDEF</CX.1>
                </PID.4>
            </PID.4.LST>
            <PID.5.LST>
                <PID.5>
                    <XPN.1>PATIENT</XPN.1>
                    <XPN.2>BOB</XPN.2>
                    <XPN.3>S</XPN.3>
                </PID.5>
            </PID.5.LST>
        </PID>
    </ADT_A03>
    

    Additional XML file with translations of HL7 tags:

    <?xml version="1.0" encoding="UTF-8"?>
    <HL7_translations>
        <MSH description="MessageHeader">
            <MSH.7 description="DateTimeOfMessage"/>
        </MSH>
        <EVN description="EventType">
            <EVN.6 description="EventOccurred"/>
        </EVN>
        <PID description="PatientIdentification">
            <PID.1 description="SetID_PatientID"/>
            <PID.2 description="PatientID_ExternalID"/>
            <PID.3 description="PatientID_InternalID"/>
            <PID.4 description="AlternatePatientID_PID"/>
            <PID.5 description="PatientName"/>
        </PID>
    </HL7_translations>
    

    XSL transformation, which reads and processes both files (the HL7_translations parameter names the translation file; the input source file is the HL7 XML):

    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
    
    <xsl:param name="HL7_translations"/>
    
    <xsl:variable name="doc" select="document($HL7_translations)"/>
    
    <xsl:template match="/">
        <xsl:apply-templates/>
    </xsl:template>
    
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>
    
    <xsl:template match="*">
        <xsl:variable name="trans" select="$doc//*[name() = name(current())]/@description"/>
        <xsl:copy>
            <xsl:if test="$trans">
                <xsl:attribute name="description"><xsl:value-of select="$trans"/></xsl:attribute>
            </xsl:if>
            <xsl:apply-templates/>
        </xsl:copy>
    </xsl:template>
    
    </xsl:stylesheet>
    

    Output result:

    <?xml version="1.0" encoding="UTF-8"?>
    <ADT_A03>
        <MSH description="MessageHeader">
            <MSH.7 description="DateTimeOfMessage">19900314130405</MSH.7>
        </MSH>
        <EVN description="EventType">
            <EVN.6 description="EventOccurred">19980327095000</EVN.6>
        </EVN>
        <PID description="PatientIdentification">
            <PID.4.LST>
                <PID.4 description="AlternatePatientID_PID">
                    <CX.1>123456789ABCDEF</CX.1>
                </PID.4>
            </PID.4.LST>
            <PID.5.LST>
                <PID.5 description="PatientName">
                    <XPN.1>PATIENT</XPN.1>
                    <XPN.2>BOB</XPN.2>
                    <XPN.3>S</XPN.3>
                </PID.5>
            </PID.5.LST>
        </PID>
    </ADT_A03>
    

    EDIT: include the parent node when retrieving descriptions

    New XML input:

    <?xml version="1.0" encoding="UTF-8"?>
    <ADT_A03>
        <MSH>
            <MSH.7>19900314130405</MSH.7>
        </MSH>
        <EVN>
            <EVN.6>19980327095000</EVN.6>
        </EVN>
        <PID>
            <PID.4.LST>
                <PID.4>
                    <CX.1>123456789ABCDEF</CX.1>
                </PID.4>
            </PID.4.LST>
            <PID.5.LST>
                <PID.5>
                    <XPN.1>PATIENT</XPN.1>
                    <XPN.2>BOB</XPN.2>
                    <XPN.3>S</XPN.3>
                </PID.5>
            </PID.5.LST>
        </PID>
        <PV1>
            <PV1.7>
                <XCN.1>Attending Doctor's name</XCN.1>
            </PV1.7>
            <PV1.8>
                <XCN.1>Referring Doctor's name</XCN.1>
            </PV1.8>
        </PV1>
        <OBR>
            <OBR.10>
                <XCN.1>Collector Identifier's name</XCN.1>
            </OBR.10>
        </OBR>
    </ADT_A03>
    

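    New file with descriptions (note the addition of intermediate levels like PID.4.LST: because the lookup now compares parent names, each described element must sit under a parent with the same name as in the input, or directly under HL7_translations):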

    <?xml version="1.0" encoding="UTF-8"?>
    <HL7_translations>
        <MSH description="MessageHeader">
            <MSH.7 description="DateTimeOfMessage"/>
        </MSH>
        <EVN description="EventType">
            <EVN.6 description="EventOccurred"/>
        </EVN>
        <PID description="PatientIdentification">
            <PID.1 description="SetID_PatientID"/>
            <PID.2 description="PatientID_ExternalID"/>
            <PID.3 description="PatientID_InternalID"/>
            <PID.4.LST>
                <PID.4 description="AlternatePatientID_PID"/>
            </PID.4.LST>
            <PID.5.LST>
                <PID.5 description="PatientName"/>
            </PID.5.LST>
        </PID>
        <PV1>
            <PV1.7>
                <XCN.1 description="Attending Doctor"/>
            </PV1.7>
            <PV1.8>
                <XCN.1 description="Referring Doctor"/>
            </PV1.8>
        </PV1>
        <OBR>
            <OBR.10>
                <XCN.1 description="Collector Identifier"/>
            </OBR.10>
        </OBR>
    </HL7_translations>
    

    New XSLT:

    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
    
    <xsl:param name="HL7_translations"/>
    
    <xsl:variable name="doc" select="document($HL7_translations)"/>
    
    <xsl:template match="/">
        <xsl:apply-templates/>
    </xsl:template>
    
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>
    
    <xsl:template match="*">
        <xsl:variable name="trans" select="$doc//*[ name() = name(current()) and 
            ( name(current()/..) = name(..) or name(..) = 'HL7_translations')
            ]"/>
        <xsl:copy>
            <xsl:if test="$trans/@description">
                <xsl:attribute name="description"><xsl:value-of select="$trans/@description"/></xsl:attribute>
            </xsl:if>
            <xsl:apply-templates/>
        </xsl:copy>
    </xsl:template>
    
    </xsl:stylesheet>
    

    So the only change is:

    <xsl:variable name="trans" select="$doc//*[ name() = name(current()) and 
        ( name(current()/..) = name(..) or name(..) = 'HL7_translations')
        ]"/>
    

    In addition, I moved /@description out of the variable and into the two places where the variable is used.
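
    To make the new predicate easier to follow, here is roughly the same lookup written as plain Java DOM code. This is only an illustration of the matching rule, not part of the transformation, and the class and method names (ParentAwareLookup, parse, describe) are made up for this sketch.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical illustration of the parent-aware match used in the stylesheet.
class ParentAwareLookup {

    // Parses XML text into a DOM document.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Returns the description for an input element, or null if there is none.
    // A translation entry matches when its name equals the input element's name
    // AND its parent has the same name as the input element's parent, or the
    // entry sits directly under HL7_translations.
    static String describe(Document translations, Element current) {
        String currentParent = current.getParentNode().getNodeName();
        // All elements in document order, comparable to $doc//* in the XSLT.
        NodeList all = translations.getElementsByTagName("*");
        for (int i = 0; i < all.getLength(); i++) {
            Element candidate = (Element) all.item(i);
            String candidateParent = candidate.getParentNode().getNodeName();
            boolean sameName = candidate.getNodeName().equals(current.getNodeName());
            boolean parentOk = candidateParent.equals(currentParent)
                    || candidateParent.equals("HL7_translations");
            // Like value-of on $trans/@description: the first description wins.
            if (sameName && parentOk && candidate.hasAttribute("description")) {
                return candidate.getAttribute("description");
            }
        }
        return null;
    }
}
```

    With the translation file above, the XCN.1 under PV1.7 resolves to "Attending Doctor" and the one under PV1.8 to "Referring Doctor", which a name-only match could not tell apart.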

    New output result:

    <?xml version="1.0" encoding="UTF-8"?>
    <ADT_A03>
        <MSH description="MessageHeader">
            <MSH.7 description="DateTimeOfMessage">19900314130405</MSH.7>
        </MSH>
        <EVN description="EventType">
            <EVN.6 description="EventOccurred">19980327095000</EVN.6>
        </EVN>
        <PID description="PatientIdentification">
            <PID.4.LST>
                <PID.4 description="AlternatePatientID_PID">
                    <CX.1>123456789ABCDEF</CX.1>
                </PID.4>
            </PID.4.LST>
            <PID.5.LST>
                <PID.5 description="PatientName">
                    <XPN.1>PATIENT</XPN.1>
                    <XPN.2>BOB</XPN.2>
                    <XPN.3>S</XPN.3>
                </PID.5>
            </PID.5.LST>
        </PID>
        <PV1>
            <PV1.7>
                <XCN.1 description="Attending Doctor">Attending Doctor's name</XCN.1>
            </PV1.7>
            <PV1.8>
                <XCN.1 description="Referring Doctor">Referring Doctor's name</XCN.1>
            </PV1.8>
        </PV1>
        <OBR>
            <OBR.10>
                <XCN.1 description="Collector Identifier">Collector Identifier's name</XCN.1>
            </OBR.10>
        </OBR>
    </ADT_A03>