XSL Node Swapping doesn't work


I'm trying to write an XSL stylesheet to tidy up certain XML files (Maven POMs). I want to rearrange the order of certain top-level elements, remove one element, and copy all the rest as-is. An example of the original XML:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>net.sourceforge.ondex.apps</groupId>
    <name>Ondex</name>
    <version>0.6.0-SNAPSHOT</version>
    <artifactId>installer</artifactId>
    <packaging>pom</packaging>
    <description>NSIS based Installer</description>
    <parent>
        <artifactId>apps</artifactId>
        <groupId>net.sourceforge.ondex</groupId>
        <version>0.6.0-SNAPSHOT</version>
    </parent>
    <organization>
        <name>Ondex Project</name>
        <url>http://www.ondex.org</url>
    </organization>

    <build>
    ...
    </build>
  ...
</project>

This XSL almost works (with Saxon HE-9-7-06J):

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:math="http://www.w3.org/2005/xpath-functions/math" exclude-result-prefixes="xs math pom"
    xmlns="http://maven.apache.org/POM/4.0.0"
    xmlns:pom="http://maven.apache.org/POM/4.0.0"
    >
    <xsl:output method="xml" indent="yes" />

    <xsl:template match="/pom:project">
        <project>
            <xsl:copy-of select="@*" />
            <xsl:apply-templates select="pom:modelVersion" />
            <xsl:apply-templates select="pom:parent" />     
            <xsl:apply-templates select="pom:groupId" />
            <xsl:apply-templates select="pom:artifactId" />
            <xsl:apply-templates select="pom:name" />
            <xsl:apply-templates select="pom:description" />
            <xsl:apply-templates
                select="node() except (pom:modelVersion|pom:parent|pom:groupId|pom:artifactId|pom:name|pom:description|pom:version)" />
        </project>
    </xsl:template>

    <!-- And the usual identity transform for all other nodes --> 
    <xsl:template match="node()|@*">
        <xsl:copy><xsl:apply-templates select="node()|@*" /></xsl:copy>
    </xsl:template>

</xsl:stylesheet>

However, the output has unwanted blank lines where the moved nodes used to be (e.g., see the lines after description, where parent was originally):

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
   <modelVersion>4.0.0</modelVersion>
   <parent>
            <artifactId>apps</artifactId>
            <groupId>net.sourceforge.ondex</groupId>
            <version>0.6.0-SNAPSHOT</version>
      </parent>
   <groupId>net.sourceforge.ondex.apps</groupId>
   <artifactId>installer</artifactId>
   <name>Ondex</name>
   <description>NSIS based Installer</description>





      <packaging>pom</packaging>


      <organization>
            <name>Ondex Project</name>
            <url>http://www.ondex.org</url>
      </organization>

      <build>
      ...
      </build>
  ...
</project>

What am I doing wrong? Note that I don't want to use xsl:strip-space, because I want to preserve spaces that are put in the original file for readability purposes.


Solution

  • OK, thanks to the answers and comments kindly posted here, I've realised what's going on and found a workaround:

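    As @michael.hor257k explains, the problem is that the newline between two elements (e.g., between </parent> and <organization>) is a whitespace-only text node in its own right. The node() selection matches it and copies it to the output even when the neighbouring element has been moved elsewhere, which results in empty lines.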

    <xsl:strip-space> alone isn't enough, because it removes these newlines together with the manually inserted blank lines, which I want to keep.

    But it is a good start: I preprocess the XML with:

    sed -E 's/^\s*$/<white-line\/>/' pom.xml | sponge pom.xml
    

    That is, every truly blank line is replaced by a <white-line/> element. With that in place, it's easy to add the following template to the XSL above, together with <xsl:strip-space elements="*" />:

    <xsl:template match="pom:white-line">
      <xsl:text>
    
      </xsl:text>
    </xsl:template>
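
    Assembled, the stylesheet might look like the sketch below. It assumes the sed step has already been run, so that blank lines arrive as <white-line/> elements in the POM namespace. The templates are the ones from the question plus the two additions; the unused xs and math namespace declarations are left out, and the XML declaration is omitted only because the block is indented here. How the result is finally laid out still depends on how the processor applies indent="yes".

    ```xml
    <xsl:stylesheet version="2.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns="http://maven.apache.org/POM/4.0.0"
        xmlns:pom="http://maven.apache.org/POM/4.0.0"
        exclude-result-prefixes="pom">

        <xsl:output method="xml" indent="yes" />

        <!-- Addition 1: discard all whitespace-only text nodes of the input.
             The blank lines worth keeping were already marked by sed. -->
        <xsl:strip-space elements="*" />

        <xsl:template match="/pom:project">
            <project>
                <xsl:copy-of select="@*" />
                <!-- Elements to move, in the desired order -->
                <xsl:apply-templates select="pom:modelVersion" />
                <xsl:apply-templates select="pom:parent" />
                <xsl:apply-templates select="pom:groupId" />
                <xsl:apply-templates select="pom:artifactId" />
                <xsl:apply-templates select="pom:name" />
                <xsl:apply-templates select="pom:description" />
                <!-- Everything else in document order; pom:version is excluded
                     here and never applied above, so it is removed -->
                <xsl:apply-templates
                    select="node() except (pom:modelVersion|pom:parent|pom:groupId|pom:artifactId|pom:name|pom:description|pom:version)" />
            </project>
        </xsl:template>

        <!-- Addition 2: turn each placeholder back into whitespace. The content
             of xsl:text is emitted literally, line breaks and spaces included. -->
        <xsl:template match="pom:white-line">
          <xsl:text>

          </xsl:text>
        </xsl:template>

        <!-- Identity transform for all other nodes -->
        <xsl:template match="node()|@*">
            <xsl:copy><xsl:apply-templates select="node()|@*" /></xsl:copy>
        </xsl:template>

    </xsl:stylesheet>
    ```

    The pom:white-line template has a higher default priority than the node()|@* identity template, so it wins for the placeholder elements without needing an explicit priority attribute.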
    

    You might also need to remove leading and trailing blank lines first. Otherwise sed replaces them with <white-line/> elements outside the root element, which makes the file ill-formed and causes a parse error.
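
    A possible alternative that avoids the preprocessing step is to drop only the whitespace text nodes that directly follow a moved or removed element, and to keep the rest. The sketch below is an assumption-laden variant of the question's stylesheet, not something verified against a full POM: it treats a whitespace node containing two or more newlines as a deliberate blank line and keeps it, and it only looks at direct children of project.

    ```xml
    <xsl:stylesheet version="2.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns="http://maven.apache.org/POM/4.0.0"
        xmlns:pom="http://maven.apache.org/POM/4.0.0"
        exclude-result-prefixes="pom">

        <xsl:output method="xml" indent="yes" />

        <xsl:template match="/pom:project">
            <project>
                <xsl:copy-of select="@*" />
                <!-- The comma operator keeps this order in XSLT 2.0 -->
                <xsl:apply-templates
                    select="pom:modelVersion, pom:parent, pom:groupId,
                            pom:artifactId, pom:name, pom:description" />
                <xsl:apply-templates
                    select="node() except (pom:modelVersion|pom:parent|pom:groupId|pom:artifactId|pom:name|pom:description|pom:version)" />
            </project>
        </xsl:template>

        <!-- Assumption: a whitespace-only text node with at most one newline is
             just the line break after the preceding element. If that element
             was moved or removed, output nothing for the text node. -->
        <xsl:template match="/pom:project/text()
            [not(normalize-space())]
            [not(contains(substring-after(., '&#10;'), '&#10;'))]
            [preceding-sibling::*[1][
                self::pom:modelVersion or self::pom:parent or
                self::pom:groupId or self::pom:artifactId or
                self::pom:name or self::pom:description or
                self::pom:version]]" />

        <!-- Identity transform for all other nodes -->
        <xsl:template match="node()|@*">
            <xsl:copy><xsl:apply-templates select="node()|@*" /></xsl:copy>
        </xsl:template>

    </xsl:stylesheet>
    ```

    The trade-off is that the list of moved elements now appears in two places and has to be kept in sync, whereas the sed approach needs no such list.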

    Thanks for the help!