xml xslt xquery legacy basex

How to find dead code in an XSLT legacy codebase


I’ve started writing a BaseX v9 XQuery script to help me analyze an XSLT legacy codebase. There are ~100 XSL v1.0 files, and over the years they have become a bit of a mess.

The code requires a cleanup. A first step is to find unused code (templates, variables etc.).

My question is about finding match templates that are no longer used:

 <xsl:template match="blabla">...</xsl:template>

In general, what is the best way to find match templates (with or without a mode attribute) that will never be executed?


Solution

  • There are two reasons a template rule might never be executed: either there is another template rule that will always be chosen in preference, or there is NO conceivable source document with a node that matches its pattern. It's not really practical to detect either of these situations by static analysis of the stylesheet. Your best bet is to do some code coverage analysis dynamically, using a representative sample of source documents (and other input conditions such as values of stylesheet parameters).

    It's worth googling for "code coverage XSLT" though it will give you more questions than answers. XSpec does code coverage, but only if you have a comprehensive set of XSpec tests, which seems unlikely. Saxon has a facility (-TP:profile.html) that outputs how often each template rule has executed, but sadly, it omits those for which the count is zero. It wouldn't be too hard to combine this data with some source code analysis to find the template rules that don't appear in the list (and also, to combine the outputs from multiple runs with different source documents.)
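
    The source code analysis half of that comparison could start from an inventory of every match template. The following is only a minimal sketch, assuming an XSLT 2.0 processor and that it is run once per legacy file with that stylesheet as the source document. The report element names (templates, template) and the index attribute are hypothetical. How the entries are then lined up with Saxon's profile output depends on the format of that report, which is not covered here.

    ```xml
    <?xml version="1.0" encoding="UTF-8"?>
    <!-- Inventory sketch: the source document is one of the legacy stylesheets. -->
    <xsl:stylesheet version="2.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

      <xsl:output method="xml" indent="yes"/>

      <xsl:template match="/">
        <!-- <templates>/<template> are hypothetical names for the report -->
        <templates source="{base-uri(.)}">
          <xsl:for-each select="//xsl:template[@match]">
            <!-- mode and priority come out as empty strings when the
                 original rule does not declare them -->
            <template index="{position()}"
                      match="{@match}"
                      mode="{@mode}"
                      priority="{@priority}"/>
          </xsl:for-each>
        </templates>
      </xsl:template>

    </xsl:stylesheet>
    ```

    Any rule listed here that never shows up in the profiles collected over all runs is a candidate for closer inspection, not proof of dead code.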

    An alternative to actually executing the stylesheet against many source documents would be to extract the match patterns into a synthetic stylesheet that tests every input node against every match pattern. You could output an element for every match, then post-process the output to look for patterns missing from the output:

    First:

    <xsl:template match="... a sample pattern... ">
      <match id="654321"/>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:template>
    <!-- repeated for each pattern in the original stylesheet -->
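
    Writing those rules by hand for ~100 files would be tedious, so the synthetic stylesheet could itself be generated. What follows is a hedged sketch, not a finished tool. It assumes an XSLT 2.0 processor, one legacy stylesheet as the source document, and namespace prefixes declared on that stylesheet's root element. The alias namespace urn:example:generated-xslt, the mode name probe and the matches wrapper element are hypothetical. Ids are simply the position of each match template within the file, so they are only unique per file.

    ```xml
    <?xml version="1.0" encoding="UTF-8"?>
    <!-- Generator sketch: turns one legacy stylesheet into a synthetic
         stylesheet with one numbered rule per match pattern. -->
    <xsl:stylesheet version="2.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:out="urn:example:generated-xslt">

      <!-- out:* literal result elements are written to the result as xsl:* -->
      <xsl:namespace-alias stylesheet-prefix="out" result-prefix="xsl"/>
      <xsl:output method="xml" indent="yes"/>

      <xsl:template match="/">
        <out:stylesheet version="1.0">
          <!-- carry over prefixed namespaces that the patterns may rely on -->
          <xsl:copy-of select="/*/namespace::*[name() != '' and name() != 'xml'
                               and . != 'http://www.w3.org/1999/XSL/Transform']"/>
          <!-- patterns that call key() need the original key declarations -->
          <xsl:copy-of select="/*/xsl:key"/>

          <!-- entry point: wrap all markers in a single root element -->
          <out:template match="/">
            <matches>
              <out:apply-templates select="." mode="probe"/>
            </matches>
          </out:template>

          <!-- one rule per original pattern; the original modes are dropped
               and every rule is placed in the single mode "probe" -->
          <xsl:for-each select="//xsl:template[@match]">
            <out:template match="{@match}" mode="probe">
              <match id="{position()}"/>
              <out:apply-templates select="@*|node()" mode="probe"/>
            </out:template>
          </xsl:for-each>

          <!-- low-priority fallbacks: keep walking into attributes and
               children of unmatched elements, and suppress text output -->
          <out:template match="*" mode="probe" priority="-10">
            <out:apply-templates select="@*|node()" mode="probe"/>
          </out:template>
          <out:template match="text()|@*" mode="probe" priority="-10"/>
        </out:stylesheet>
      </xsl:template>

    </xsl:stylesheet>
    ```

    Running the generated stylesheet against each sample document then yields a matches element containing one match marker per rule that fired. Explicit priority attributes are deliberately not copied, so rules compete on default priority only.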
    

    Then for the analysis:

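    <!-- top-level declarations; needs XSLT 2.0+ with the xs prefix bound to XML Schema -->
    <xsl:variable name="data" select="/"/>
    <xsl:key name="k" match="match" use="@id"/>

    <!-- inside a template: report every id from 1 to the highest id seen that has no <match> -->
    <xsl:for-each select="(1 to xs:integer(max(//match/@id)))[not(key('k', string(.), $data))]">
      <xsl:text>No matches for pattern id="</xsl:text>
      <xsl:value-of select="."/>
      <xsl:text>"&#10;</xsl:text>
    </xsl:for-each>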
    

    However, this will give you false "no matches" results for template rules that could be fired by an xsl:apply-imports or xsl:next-match instruction, or for nodes that match more than one template rule (perhaps in different modes). I'm sure the idea could be refined.