
How do I remove all line breaks?


I have something like this:

<node TEXT="   txt A   "/>
<node TEXT="

       txt X

"/>
<node>
   <html>
      <p>
        txt Y
      </p>
   </html>
</node>
<node TEXT="txt B"/>

and I want to use XSLT to get this:

txt A
txt X
txt Y
txt B

I want to strip all superfluous whitespace and line breaks from the @TEXT attributes and from the character data (CDATA). The only part of the XML input that should give structure to the output is the <node> elements.


Solution

  • The following transformation:

    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>
    
    <xsl:template match="*">
      <xsl:apply-templates select="@TEXT | node()"/>
    </xsl:template>
    
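    <!-- Attribute and text nodes have no children, so no further apply-templates is needed here -->
    <xsl:template match="node/@TEXT | text()">
      <!-- Skip whitespace-only nodes: normalize-space() returns '' for them, which tests as false -->
      <xsl:if test="normalize-space(.)">
        <xsl:value-of select=
         "concat(normalize-space(.), '&#xA;')"/>
      </xsl:if>
    </xsl:template>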
    
    </xsl:stylesheet>
    

    when applied to this XML document:

    <t>
    <node TEXT="   txt A   "/>
    <node TEXT="       txt X"/>
    <node>
        <html>
            <p>        txt Y      </p>
        </html>
    </node>
    <node TEXT="txt B"/>
    </t>
    

    produces the wanted result:

    txt A
    txt X
    txt Y
    txt B

    Note the use of the standard XPath function normalize-space(), which strips all leading and trailing whitespace and replaces every internal run of whitespace characters (spaces, tabs, carriage returns, line feeds) with a single space.
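
To see what normalize-space() does in isolation, here is a minimal sketch. The variable name raw and its sample value are made up for illustration and are not taken from the question's input. The value mixes leading and trailing spaces with tabs and a line feed in the middle, and the square brackets only make the trimmed edges visible in the text output.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical sample string: 3 leading spaces, two tabs (&#x9;),
       a line feed (&#xA;) and 3 trailing spaces. Character references
       keep the tab and line feed intact inside the attribute value. -->
  <xsl:variable name="raw" select="'   txt&#x9;&#x9;X &#xA;  more   '"/>

  <xsl:template match="/">
    <!-- Trims both ends and collapses each inner whitespace run to one space -->
    <xsl:value-of select="concat('[', normalize-space($raw), ']')"/>
  </xsl:template>
</xsl:stylesheet>
```

Applied to any well-formed XML document (the input is ignored), this should output `[txt X more]`. The same collapsing is why an attribute value or text node that spans several lines comes out as a single line in the transformation above.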