I'm trying to write an XSL stylesheet to tidy up certain XML files (Maven POMs). I want to rearrange the order of certain top-level elements, remove one element, and copy everything else as-is. An example of the original XML is:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>net.sourceforge.ondex.apps</groupId>
<name>Ondex</name>
<version>0.6.0-SNAPSHOT</version>
<artifactId>installer</artifactId>
<packaging>pom</packaging>
<description>NSIS based Installer</description>
<parent>
<artifactId>apps</artifactId>
<groupId>net.sourceforge.ondex</groupId>
<version>0.6.0-SNAPSHOT</version>
</parent>
<organization>
<name>Ondex Project</name>
<url>http://www.ondex.org</url>
</organization>
<build>
...
</build>
...
</project>
This XSL almost works (with Saxon HE-9-7-06J):
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:math="http://www.w3.org/2005/xpath-functions/math" exclude-result-prefixes="xs math pom"
xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:pom="http://maven.apache.org/POM/4.0.0"
>
<xsl:output method="xml" indent="yes" />
<xsl:template match="/pom:project">
<project>
<xsl:copy-of select="@*" />
<xsl:apply-templates select="pom:modelVersion" />
<xsl:apply-templates select="pom:parent" />
<xsl:apply-templates select="pom:groupId" />
<xsl:apply-templates select="pom:artifactId" />
<xsl:apply-templates select="pom:name" />
<xsl:apply-templates select="pom:description" />
<xsl:apply-templates
select="node() except (pom:modelVersion|pom:parent|pom:groupId|pom:artifactId|pom:name|pom:description|pom:version)" />
</project>
</xsl:template>
<!-- And the usual identity transform for all other nodes -->
<xsl:template match="node()|@*">
<xsl:copy><xsl:apply-templates select="node()|@*" /></xsl:copy>
</xsl:template>
</xsl:stylesheet>
However, the output has unwanted blank lines added in place of the nodes that are moved (e.g., see the lines after description, where initially I had parent):
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
<modelVersion>4.0.0</modelVersion>
<parent>
<artifactId>apps</artifactId>
<groupId>net.sourceforge.ondex</groupId>
<version>0.6.0-SNAPSHOT</version>
</parent>
<groupId>net.sourceforge.ondex.apps</groupId>
<artifactId>installer</artifactId>
<name>Ondex</name>
<description>NSIS based Installer</description>
<packaging>pom</packaging>
<organization>
<name>Ondex Project</name>
<url>http://www.ondex.org</url>
</organization>
<build>
...
</build>
...
</project>
What am I doing wrong? Note that I don't want to use xsl:strip-space, because I want to preserve spaces that are put in the original file for readability purposes.
OK, after the answers and comments you kindly wrote here, I've realised what's going on and found a workaround:
As @michael.hor257k explains, the problem is that the newline between matched elements (e.g., between </parent> and <organization>) is matched by XSL as a whitespace-only text node of its own and copied to the output, resulting in empty lines.
<xsl:strip-space> alone isn't enough, because it removes these newlines together with the manually inserted blank lines, which I want to keep.
But it is a good start: I preprocess the XML with:
sed -E s/'^\s*$'/'<white-line\/>'/ pom.xml | sponge pom.xml
that is, all 'true' blank lines are replaced by the tag <white-line />. Now it's easy to add this to the XSL above, in addition to <xsl:strip-space elements="*" />:
<xsl:template match="pom:white-line">
<xsl:text>
</xsl:text>
</xsl:template>
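If sponge (part of moreutils) isn't installed, the sed preprocessing step can probably be done through a temporary file instead. This is only a sketch: the function name mark_blank_lines_in_place is made up for illustration, and I use [[:space:]] instead of \s on the assumption that it is more portable across sed implementations.

```shell
#!/bin/sh
# Hypothetical helper (name invented for this sketch): replace every
# whitespace-only line of a file with <white-line/>, going through a
# temporary file instead of relying on sponge.
mark_blank_lines_in_place() {
  file=$1
  tmp=$(mktemp) || return 1
  # [[:space:]] is the POSIX character class; \s is a GNU extension.
  if sed -E 's/^[[:space:]]*$/<white-line\/>/' "$file" > "$tmp"; then
    # Write back with cat so the original file keeps its permissions.
    cat "$tmp" > "$file"
    rm -f "$tmp"
  else
    rm -f "$tmp"
    return 1
  fi
}

# Usage: mark_blank_lines_in_place pom.xml
```

The original file is only overwritten if sed succeeds.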
You might also need to remove leading/trailing blank lines first; otherwise they get replaced with <white-line /> elements outside the root element, which makes the XML ill-formed and causes an error.
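An alternative to removing those lines is to leave them unmarked by restricting the substitution to a sed address range. This is a sketch, not something I've run on real POMs: the function name mark_blank_lines_in_root is invented, and it assumes the opening <project tag and the closing </project> tag sit on different lines, as in the example above.

```shell
#!/bin/sh
# Hypothetical helper (name invented for this sketch): mark whitespace-only
# lines with <white-line/>, but only from the line containing "<project"
# through the line containing "</project>". Reads stdin, writes stdout.
mark_blank_lines_in_root() {
  sed -E '/<project/,/<\/project>/ s/^[[:space:]]*$/<white-line\/>/'
}

# Small demo: only the blank line inside the root element gets marked;
# the blank lines before and after it are passed through unchanged.
printf '\n<project>\n<a/>\n\n<b/>\n</project>\n\n' | mark_blank_lines_in_root
```

For a real file this would be something like mark_blank_lines_in_root < pom.xml > pom.marked.xml, followed by the XSL transform on the marked copy.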
Thanks for the help!