Some initial questions about XProc


I have been working with a product that, out of the box, outputs a "raw" XML syslog in a proprietary (but simple) XML format, and we need to transform some of the information in some of the messages.

That product supports the ability to configure transforming the "raw" XML into formats that are customized for SIEM products like Splunk and ArcSight using a single XSLT:

Product ==> "raw" XML syslog output ==> SIEM-specific XSLT ==> SIEM (e.g., ArcSight, Splunk, etc.)

Now, we have a requirement to modify ONE of the XML elements in some of the raw syslog messages, before the messages get transformed by the "SIEM-specific" XSLT. So we want to be able to have a flow like:

Product ==> "raw" XML syslog output ==> transformed by "our XSLT" ==> transformed by "SIEM-specific XSLT provided by the product vendor" ==> SIEM (e.g., ArcSight, Splunk, etc.)

I have been working on "our XSLT" (which transforms just that one element before the result is passed to the SIEM-specific XSLT) and posted about it on one of the XSLT mailing lists. One of the responses suggested using XProc to provide the "XSLT chaining", so I have been doing some reading, but I still have some questions about XProc...

a) Because we are dealing with an existing product as the source of the syslog XML, and because we need to leverage the XSLT processor that is built into the product, which appears to use XALAN-C v1, we are, unfortunately, limited to using XSLT 1.0.

Is this going to prevent us from using XProc?

b) Also, I have been doing the XSLT development on a CentOS system: I test my XSLT with xsltproc first and, once it works there, I test it with the product itself.

As part of the XSLT development process, and the discussions on the XSLT mailing list, I have confirmed that the XSLT engine that the product uses is XALAN-C, and also that the exslt:node-set() function is available and working in both xsltproc and in the product itself.

My original intention was to implement the XSLT chaining in our own XSLT code, but after some research it seemed that we would essentially have to replicate functionality that XProc already provides. So I want to try to get the XSLT chaining working with XProc, combining our XSLT with the XSLT provided by the product vendor for the SIEM.

Given the limitations we have (e.g., being limited to XSLT 1.0), are there any gotchas in using XProc to do what I described?

c) Also, I've seen some mention of Java... Assuming that the product we are working with supports XSLT 1.0 including exslt:node-set(), is Java also going to be needed in order to use XProc on the product?

I would be interested in any feedback and thanks in advance!

EDIT 1: I kind of messed up in my original post in the line where I tried to show the "flow" of XSLTs... The SO UI was making some text disappear. I have corrected that line now...

EDIT 2: I think we are kind of close to what might work with the XSLT that Conal posted, but per my comments in response to Conal, I am getting that error:

It is an error to call 'apply-imports' when there's no current template rule.
error: file EmptyDetailsauditrecord.xml
xsltRunStylesheet : run failed

and I don't understand what is causing that error. But I have a question: with the approach that Conal is suggesting, is it required that "third-party-stylesheet.xsl" be MODIFIED to add the "mode" to all of the xsl:template elements INSIDE it? The error seems to be saying that no templates in the imported XSLT are matching, so I'm wondering whether the reason is that none of the xsl:template elements inside "third-party-stylesheet.xsl" have mode="pre-process".

EDIT 3:

The following is the current pipeline.xsl, after I added the code from my XSLT into it and also added the additional mode="pre-process". The pipeline.xsl now runs without an error, but it is basically just transforming the input XML with the third-party-stylesheet.xsl, instead of first doing the transformation that is in the pipeline.xsl itself and then applying the third-party-stylesheet.xsl.

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      xmlns:exslt="http://exslt.org/common"
      exclude-result-prefixes="exslt"
      version="1.0">

<xsl:import href="./third-party-stylesheet.xsl"/>
<xsl:template mode="pre-process" match="/">
   <xsl:variable name="pre-processed-xml">
      <xsl:apply-templates mode="pre-process" select="/" />
   </xsl:variable>

   <xsl:message><xsl:value-of select="$pre-processed-xml"/></xsl:message>
   <xsl:for-each select="exslt:node-set($pre-processed-xml)">
      <!-- apply the imported templates to the pre-processed-xml -->
      <xsl:apply-imports/>
   </xsl:for-each>
</xsl:template>

<!-- pre-processor generally copies the input -->
<xsl:template match="@*|node()" mode="pre-process">
  <xsl:copy>
    <xsl:apply-templates mode="pre-process" select="@*|node()"/>
  </xsl:copy>
</xsl:template>




    <!-- THE CODE BELOW IS WHAT WAS ORIGINALLY IN  "my XSLT" -->


    <!-- Handle processing of <ExtraDetails>... -->
    <xsl:template match="/syslog/audit_record/ExtraDetails">


    <!-- SET GLOBAL VARIABLE $incomingMessageID... -->
    <xsl:variable name="incomingMessageID" select="/syslog/audit_record/MessageID"/>

    <!-- SET GLOBAL VARIABLE $incomingExtraDetails... -->
    <xsl:variable name="incomingExtraDetails" select="/syslog/audit_record/ExtraDetails"/>



        <xsl:message>+++++++++ UPON ENTERING template match - incomingExtraDetails: [<xsl:value-of select="$incomingExtraDetails"/>]</xsl:message>
        <ExtraDetails>

.
.
.
.
        </ExtraDetails>

    </xsl:template>






</xsl:stylesheet>

EDIT 4A: First XSLT to be run (initialparsetestSMALL.xsl):

      <?xml version="1.0"?>
      <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      xmlns:exslt="http://exslt.org/common"
      exclude-result-prefixes="exslt"
      version="1.0">

    <!-- SET GLOBAL VARIABLE $incomingMessageID... -->
    <xsl:variable name="incomingMessageID" select="/syslog/audit_record/MessageID"/>

    <!-- SET GLOBAL VARIABLE $incomingExtraDetails... -->
    <xsl:variable name="incomingExtraDetails" select="/syslog/audit_record/ExtraDetails"/>



    <!-- Identity Transform -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>



    <!-- Handle processing of <ExtraDetails>... -->
    <xsl:template match="/syslog/audit_record/ExtraDetails">
        <ExtraDetails>
            <xsl:choose>
                <xsl:when test="$incomingMessageID  = '412'">

                    <xsl:variable name="before-tab" select=" substring-before( substring-after($incomingExtraDetails, '='), '[Tab]' ) "/>
                    <xsl:variable name="after-tab" select=" substring-before( substring-after($incomingExtraDetails, '[Tab]'), ';') "/>
                    <xsl:variable name="front" select=" substring-before($incomingExtraDetails, '=')"/>
                    <xsl:variable name="frontEqual" select="concat($front, '=')"/>
                    <xsl:variable name="back" select=" substring-after($incomingExtraDetails, ';')"/>
                    <xsl:variable name="redacted" select="'************;'"/>
                    <xsl:variable name="newExtraDetails" select="concat($frontEqual,$redacted)"/>
                    <xsl:variable name="newExtraDetailsAll" select="concat($newExtraDetails,$back)"/>

                    <xsl:value-of select="$newExtraDetailsAll"/>
                </xsl:when>


                <xsl:otherwise>
                    <xsl:choose>
                        <xsl:when test="$incomingExtraDetails = ''">
                            <xsl:message>+++++++++++++++++++++++++++++ incomingExtraDetails is null</xsl:message>
                        </xsl:when>
                        <xsl:otherwise>
                            <xsl:message>+++++++++++++++++++++++++++++++++++++ NOT 412</xsl:message>
                            <xsl:value-of select="$incomingExtraDetails"/>
                            <xsl:message>+++++++++++++++++++++++++++++++++++++ DONE NOT 412</xsl:message>
                        </xsl:otherwise>
                    </xsl:choose>

                </xsl:otherwise>

            </xsl:choose>

        </ExtraDetails>

    </xsl:template>

</xsl:stylesheet>

EDIT 4B: Second/Final XSLT to be run (third-party-stylesheet.xsl) - note this XSLT produces text output:

<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:import href='./RFC5424Changes.xsl'/>
    <xsl:output method='text' version='1.0' encoding='UTF-8'/>
    <xsl:key name="CAProperty_Name" match="CAProperty" use="@Name"/>
    <xsl:template match="/">
        <xsl:apply-imports/>
        <xsl:for-each select="syslog/audit_record"><xsl:if test="not(key('CAProperty_Name','KeyDescription')) and not(key('CAProperty_Name','ApplicationObjectID')) and (Action = 'Retrieve password' or Action = 'Use Password' or Action = 'Retrieve SSH Key' or Action = 'CPM Change Password' or Action = 'CPM Reconcile Password')">|<xsl:value-of select="Issuer"/>|<xsl:value-of select="IsoTimestamp"/>|<xsl:value-of select="Action"/>|<xsl:call-template name="print-file-category"><xsl:with-param name="category-name" select="'UserName'" /></xsl:call-template><xsl:call-template name="print-file-category"><xsl:with-param name="category-name" select="'Address'" /></xsl:call-template><xsl:call-template name="print-file-category"><xsl:with-param name="category-name" select="'PolicyID'" /></xsl:call-template><xsl:call-template name="print-file-category"><xsl:with-param name="category-name" select="'DeviceType'" /></xsl:call-template><xsl:call-template name="print-file-category"><xsl:with-param name="category-name" select="'Database'" /></xsl:call-template><xsl:call-template name="print-file-category"><xsl:with-param name="category-name" select="'AWSAccountID'" /></xsl:call-template><xsl:call-template name="print-file-category"><xsl:with-param name="category-name" select="'AWSAccessKeyID'" /></xsl:call-template><xsl:call-template name="print-file-category"><xsl:with-param name="category-name" select="'ActiveDirectoryID'" /></xsl:call-template>
            </xsl:if>
        </xsl:for-each>

        <xsl:for-each select="syslog/audit_record"><xsl:if test="Action = 'Logon' or Action = 'User Authentication'">|<xsl:value-of select="Issuer"/>|<xsl:value-of select="IsoTimestamp"/>|<xsl:value-of select="Action"/>|<xsl:value-of select="Station"/>|</xsl:if>
        </xsl:for-each>

        <xsl:for-each select="syslog/audit_record"><xsl:if test="Action = 'Store password' or Action = 'Store SSH Key'">|<xsl:value-of select="IsoTimestamp"/>|<xsl:value-of select="Action"/>|<xsl:value-of select="Safe"/>|<xsl:value-of select="File"/>|</xsl:if>
        </xsl:for-each>

            <!-- MessageID = '361' is Command Audit                     -->
            <!-- MessageID = '359' is Command Audit                     -->
            <!-- MessageID = '302' is Disconnect Audit  -->
        <xsl:for-each select="syslog/audit_record"><xsl:if test="MessageID = '361' or MessageID = '359' or MessageID = '411' or MessageID = '412' or MessageID = '436' or MessageID = '300' or MessageID = '302'">|<xsl:value-of select="Issuer"/>|<xsl:value-of select="IsoTimestamp"/>|<xsl:value-of select="MessageID"/>|<xsl:value-of select="Action"/>|<xsl:value-of select="Station"/>|<xsl:value-of select="File"/>|<xsl:call-template name="print-file-category"><xsl:with-param name="category-name" select="'PolicyID'" /></xsl:call-template><xsl:call-template name="print-file-category"><xsl:with-param name="category-name" select="'Address'" /></xsl:call-template><xsl:call-template name="print-file-category"><xsl:with-param name="category-name" select="'DeviceType'" /></xsl:call-template><xsl:call-template name="print-file-category"><xsl:with-param name="category-name" select="'Database'" /></xsl:call-template><xsl:value-of select="ExtraDetails"/>|</xsl:if>
        </xsl:for-each>

            <!-- MessageID = '471' is Access Succeeded Syslog -->
        <xsl:for-each select="syslog/audit_record"><xsl:if test="MessageID = '471'">|<xsl:value-of select="IsoTimestamp"/>|<xsl:value-of select="MessageID"/>|<xsl:value-of select="Issuer"/>|<xsl:value-of select="Action"/>|<xsl:value-of select="Station"/>|<xsl:value-of select="ExtraDetails"/>|</xsl:if>
        </xsl:for-each>
    </xsl:template>

    <!-- replace all occurences of the character(s) `from'
         by the string `to' in the string `string'.-->
    <xsl:template name="string-replace" >
        <xsl:param name="string"/>
        <xsl:param name="from"/>
        <xsl:param name="to"/>
        <xsl:choose>
            <xsl:when test="contains($string,$from)">
                <xsl:value-of select="substring-before($string,$from)"/>
                <xsl:value-of select="$to"/>
                <xsl:call-template name="string-replace">
                    <xsl:with-param name="string" select="substring-after($string,$from)"/>
                    <xsl:with-param name="from" select="$from"/>
                    <xsl:with-param name="to" select="$to"/>
                </xsl:call-template>
            </xsl:when>
            <xsl:otherwise>
                <xsl:value-of select="$string"/>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:template>

    <xsl:template name="print-file-category">
        <xsl:param name="category-name"/>
        <xsl:param name="print-pipe-if-empty" select="true()"/>
        <xsl:variable name="out">
            <xsl:for-each select="CAProperties/CAProperty">
                <xsl:choose>
                    <xsl:when test="@Name=$category-name">
                        <xsl:value-of select="@Value" />
                    </xsl:when>
                </xsl:choose>
            </xsl:for-each>
        </xsl:variable>
        <xsl:value-of select="$out" />
        <xsl:choose>
            <xsl:when test="$print-pipe-if-empty and $out=''">|</xsl:when>
            <xsl:when test="$out!=''">|</xsl:when>
        </xsl:choose>
    </xsl:template>

</xsl:stylesheet>

EDIT 4C: Sample XML Input (sample-syslog-message.xml):

<?xml version="1.0" encoding="UTF-8"?>
<syslog>

  <audit_record>
    <Rfc>yes</Rfc>
    <Timestamp>Jul 21 01:10:30</Timestamp>
    <IsoTimestamp>2024-07-21T01:10:30Z</IsoTimestamp>
    <Hostname>VAU01</Hostname>
    <Vendor>Test</Vendor>
    <Product>Vau</Product>
    <Version>14.0.0000</Version>
    <MessageID>412</MessageID>
    <Desc>logging</Desc>
    <Severity>Info</Severity>
    <Issuer>xxxxx</Issuer>
    <Action>logging</Action>
    <SourceUser/>
    <TargetUser/>
    <File>Root\Operating</File>
    <Station>xx.yy.4.8</Station>
    <Location/>
    <Category/>
    <RequestId/>
    <Reason/>
    <ExtraDetails>Command=11111[Tab]11111;ConnectionComponentId=Users;9T;</ExtraDetails>
    <GatewayStation/>
    <CAProperties>
      <CAProperty Name="PolicyID" Value="Windows"/>
      <CAProperty Name="UserName" Value="xxxxx"/>
      <CAProperty Name="Address" Value="my.solutions"/>
      <CAProperty Name="DeviceType" Value="Operating System"/>
      <CAProperty Name="Disabled" Value="No Reason"/>
      <CAProperty Name="Logon" Value="my"/>
      <CAProperty Name="CreationMethod" Value="VWA"/>
    </CAProperties>
  </audit_record>

</syslog>

EDIT 4D: Small shell script to run the XSLTs in sequence (linux):

#!/bin/bash

xsltproc -o outfromstep1 initialparsetestSMALL.xsl sample-syslog-message.xml && cat outfromstep1

xsltproc -o outfromstep2 third-party-stylesheet.xsl outfromstep1 && cat outfromstep2

EDIT 4E: Example run output (from the third-party-stylesheet.xsl):

[root@nodejs stackoverflowexample]# ./runXslt.sh
<?xml version="1.0"?>
<syslog>

  <audit_record>
    <Rfc>yes</Rfc>
    <Timestamp>Jul 21 01:10:30</Timestamp>
    <IsoTimestamp>2024-07-21T01:10:30Z</IsoTimestamp>
    <Hostname>VAU01</Hostname>
    <Vendor>Test</Vendor>
    <Product>Vau</Product>
    <Version>14.0.0000</Version>
    <MessageID>412</MessageID>
    <Desc>logging</Desc>
    <Severity>Info</Severity>
    <Issuer>xxxxx</Issuer>
    <Action>logging</Action>
    <SourceUser/>
    <TargetUser/>
    <File>Root\Operating</File>
    <Station>xx.yy.4.8</Station>
    <Location/>
    <Category/>
    <RequestId/>
    <Reason/>
    <ExtraDetails>Command=************;ConnectionComponentId=Users;9T;</ExtraDetails>
    <GatewayStation/>
    <CAProperties>
      <CAProperty Name="PolicyID" Value="Windows"/>
      <CAProperty Name="UserName" Value="xxxxx"/>
      <CAProperty Name="Address" Value="my.solutions"/>
      <CAProperty Name="DeviceType" Value="Operating System"/>
      <CAProperty Name="Disabled" Value="No Reason"/>
      <CAProperty Name="Logon" Value="my"/>
      <CAProperty Name="CreationMethod" Value="VWA"/>
    </CAProperties>
  </audit_record>

</syslog>
|xxxxx|2024-07-21T01:10:30Z|412|logging|xx.yy.4.8|Root\Operating|Windows|my.solutions|Operating System||Command=************;ConnectionComponentId=Users;9T;|

Reminder: My goal is to implement an XSLT that will take an XML input per the sample, perform the first transformation on that incoming XML, and then perform the 2nd transformation on the output from the 1st transformation.

As discussed, ideally we cannot alter the 2nd XSLT (the third-party-stylesheet.xsl), since that is not "our software".

EDIT 5:

I originally didn't include the "pipeline2.xsl" file, which has the code that Conal had posted plus the XSLT code for doing the "pre-process". When I test with that "pipeline2.xsl", it is STILL basically skipping the pre-process XSLT that is embedded in it, so I am posting a snippet of the pipeline2.xsl in the hope that someone can tell me why that is happening.

Here is the snippet:

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      xmlns:exslt="http://exslt.org/common"
      exclude-result-prefixes="exslt"
      version="1.0">

<xsl:import href="./third-party-stylesheet.xsl"/>
<xsl:template mode="pre-process" match="/">
   <xsl:variable name="pre-processed-xml">
      <xsl:apply-templates mode="pre-process" select="/" />
   </xsl:variable>

   <xsl:message><xsl:value-of select="$pre-processed-xml"/></xsl:message>
   <xsl:for-each select="exslt:node-set($pre-processed-xml)">
      <!-- apply the imported templates to the pre-processed-xml -->
      <xsl:apply-imports/>
   </xsl:for-each>
</xsl:template>

<!-- pre-processor generally copies the input -->
<xsl:template match="@*|node()" mode="pre-process">
  <xsl:copy>
    <xsl:apply-templates select="@*|node()"/>
  </xsl:copy>
</xsl:template>


<!-- pre-processor should e.g. remove a particular element -->
<!-- <xsl:template match="unwanted-element" mode="pre-process"/> -->




<!-- From Conal's posts, I interpreted that he intended that, by the 2 lines above here, that I should basically add the code from my     initialparsetest.xsl below here, i.e., the initial transformation of the <ExtraDetails>
element is performed in this pipeline XSLT and then after that the code above would cause the transformation in the third-party-stylesheet.xsl to be performed -->


<!-- BUT when I test, the code below (the xsl:template...) does NOT SEEM to be being executed !! -->




    <!-- Handle processing of <ExtraDetails>... -->
    <xsl:template mode="pre-process" match="/syslog/audit_record/ExtraDetails">


    <!-- SET GLOBAL VARIABLE $incomingMessageID... -->
    <xsl:variable name="incomingMessageID" select="/syslog/audit_record/MessageID"/>

    <!-- SET GLOBAL VARIABLE $incomingExtraDetails... -->
    <xsl:variable name="incomingExtraDetails" select="/syslog/audit_record/ExtraDetails"/>


        <xsl:message>+++++++++ UPON ENTERING template match - incomingExtraDetails: [<xsl:value-of select="$incomingExtraDetails"/>]</xsl:message>
        <ExtraDetails>
            <xsl:choose>
                <!-- START 'when' TO HANDLE '412' requests -->
                <xsl:when test="$incomingMessageID  = '412'">
                    <xsl:message>+++++++++++++++++++++++++++++++++++++ Processing a request with MessageId 412</xsl:message>
                    <xsl:message> VARIABLE incomingExtraDetails=[<xsl:value-of select="$incomingExtraDetails"/>] </xsl:message>


.
.
.
.
BUNCH OF CODE 
.
.
.

        </ExtraDetails>

    </xsl:template>



</xsl:stylesheet>

I also found this SO thread:

XSLT template match with mode attribute set doesn't work

which seems to have similar symptoms (a template with a mode that doesn't get executed), so I am wondering whether the problem I am seeing in my pipeline2.xsl has the same cause (not having an apply-templates?).

If so, what do I need to add to the pipeline2.xsl to get it to start working?


Solution

  • NB now that the question is a bit clearer, I decided in the interest of clarity to remove my original response, and start from scratch.

    The goal is to have a pipeline which runs two XSLT stylesheets in sequence: first a pre-processor.xsl that we define ourselves, whose role is to make some edits to the input data, and then a third-party-stylesheet.xsl whose contents we don't want to change.

    So we have three XSLT files; the two mentioned above, and the main stylesheet pipeline.xsl whose job is to run the other two.

    pipeline.xsl:

    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
          xmlns:exslt="http://exslt.org/common"
          exclude-result-prefixes="exslt"
          version="1.0">
    
    <xsl:import href="third-party-stylesheet.xsl"/>
    <xsl:import href="pre-processor.xsl"/>
    
    <xsl:output method="text"/>
    
    <xsl:template match="/">
        <xsl:param name="run-preprocessor" select=" 'yes' "/>
        <xsl:choose>
            <xsl:when test="$run-preprocessor = 'yes' ">
                <xsl:variable name="pre-processed-xml">
                    <xsl:apply-templates mode="pre-process" select="/"/>
                </xsl:variable>
                <xsl:apply-templates select="exslt:node-set($pre-processed-xml)">
                    <xsl:with-param name="run-preprocessor" select=" 'no' "/>
                    <!-- 👇 recurses, but effectively calls the code below -->
                </xsl:apply-templates>
            </xsl:when>
            <xsl:otherwise>
                <!-- effectively called from ☝️ above -->
                <xsl:apply-imports/>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:template>
    
    </xsl:stylesheet>
    

    The pipeline.xsl imports the other two stylesheets in order to apply templates from both.

    First it runs the pre-processor.xsl (which is written in a pre-process mode to keep its templates from colliding with the third-party-stylesheet.xsl templates).

    The output of the pre-processor.xsl is captured in a variable, which is a "Result Tree Fragment" (because this is XSLT 1.0), so we have to use the exslt:node-set() function to convert that back into a document which we can then apply the templates of third-party-stylesheet.xsl to.

    The only tricky thing here is how, when we call apply-templates again, we get the match="/" template in third-party-stylesheet.xsl to match our pre-processed document, rather than have it match the match="/" template in the pipeline.xsl, and recurse endlessly. The trick is to use a parameter to flag whether the pre-processing stage has been done yet. When we make the recursive apply-templates call with our pre-processed document, we pass a flag to the matching template to say that the pre-processing is done. If our template gets that flag, it can then use the apply-imports statement to yield to the template in the imported third-party-stylesheet.xsl. Note that apply-imports has to be called directly from the template and not from inside an xsl:for-each: in XSLT 1.0 the current template rule is null inside xsl:for-each, which is what produces the "no current template rule" error quoted in the question.

    NB because the third-party-stylesheet.xsl uses the text output method, so does our pipeline.xsl.

    Here are my other files: pre-processor.xsl:

    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    
    <!-- pre-processor generally copies the input -->
    <xsl:template match="@*|node()" mode="pre-process">
      <xsl:copy>
        <xsl:apply-templates select="@*|node()" mode="pre-process"/>
      </xsl:copy>
    </xsl:template>
    
    <!-- pre-processor should e.g. remove a particular element -->
    <xsl:template match="unwanted-element" mode="pre-process"/>
    
    </xsl:stylesheet>
    

    third-party-stylesheet.xsl:

    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    
       <xsl:output method="text"/>
    
       <xsl:template match="/">
          <xsl:value-of select="."/>
       </xsl:template>
    
    </xsl:stylesheet>
    

    input.xml:

    <example>
        <desired-text>This text should make it past the pre-processor</desired-text>
        <unwanted-element>This element should be filtered out by the pre-processor</unwanted-element>
        <desired-text>This should appear</desired-text>
        <unwanted-element>This should not appear</unwanted-element>
    </example>
    

    Running xsltproc -o output.txt pipeline.xsl input.xml produces:

    output.txt:

    
        This text should make it past the pre-processor
        
        This should appear
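
    Applied to the data in the question, a pre-processor.xsl could look something like the sketch below. This is only my condensation of the initialparsetestSMALL.xsl from EDIT 4A, not the original code: the match pattern and the single concat() expression are my own assumptions about what the redaction needs to do, and I have made the MessageID test relative to each audit_record rather than using the global variables.

    ```xml
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

    <!-- pre-processor generally copies the input -->
    <xsl:template match="@*|node()" mode="pre-process">
      <xsl:copy>
        <xsl:apply-templates select="@*|node()" mode="pre-process"/>
      </xsl:copy>
    </xsl:template>

    <!-- For records whose MessageID is 412, replace everything between the
         first '=' and the first ';' in ExtraDetails with a fixed mask.
         (Assumed rule, condensed from the question's first stylesheet.) -->
    <xsl:template match="audit_record[MessageID = '412']/ExtraDetails" mode="pre-process">
      <ExtraDetails>
        <xsl:value-of select="concat(substring-before(., '='), '=',
                                     '************;',
                                     substring-after(., ';'))"/>
      </ExtraDetails>
    </xsl:template>

    </xsl:stylesheet>
    ```

    Every template here is in the pre-process mode, including the apply-templates call inside the copying template, so the whole pass stays in that mode and never falls through to the third-party templates. Like the stylesheet in the question, this assumes ExtraDetails contains both an '=' and a ';'. Records with any other MessageID are simply copied unchanged.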