
How do I exclude a specific element from CDATA wrapping in XSLT?


Here is the situation. I have a viewer that can read both XML and HTML. I want the text of a <note> to be wrapped in CDATA (this part already works), but if a specific element is used inside the <note>, let's call it <html>, the text in that element should not be wrapped in CDATA.

Is this even possible? I have already tried several things but have not found a solution that works for my case.

The input data is this:

<infobox>
 <title>HTML TEST</title>
   <note><b>XML in the Note</b><html html="yes">&lt;h1&gt;Headline&lt;/h1&gt;
         &lt;h2 style="color:green"&gt; Grüne Schrift &lt;/h2&gt;
         &lt;h3 style="color:black"&gt; Schwarze Schrift &lt;/h3&gt;
         &lt;h4 style="color:#FF0000"&gt; Rote Schrift &lt;/h4&gt;</html>
   </note>
</infobox>

My XSL part for the note is this:

    <xsl:template match="note[not(ancestor::info) and not(ancestor::item-content) and not(ancestor::context)]">
      <note type="{@type}">
        <xsl:text disable-output-escaping="yes">&lt;![CDATA[</xsl:text>
        <div class="{$noteClass}">
        <xsl:apply-templates/>
        </div>
        <xsl:text disable-output-escaping="yes">]]&gt;</xsl:text>
      </note>
    </xsl:template>

My published output is this:

      <infobox>
        <title id="CAP0e6ba25c-a8a1-45fb-a957-43c3195f9c326">HTML TEST</title>
        <note type=""><![CDATA[<div class="NoteContent">
                  <b>XML in the Note</b>
                  <html>&lt;<div class="html-content">&lt;h1&gt;Headline&lt;/h1&gt;&lt;h2 style="color:green"&gt; Grüne Schrift &lt;/h2&gt;
&lt;h3 style="color:black"&gt; Schwarze Schrift &lt;/h3&gt;
&lt;h4 style="color:#FF0000"&gt; Rote Schrift &lt;/h4&gt;</div>]]&gt;</html>
               </div>]]></note>
      </infobox>

My desired published output is this:

      <infobox>
        <title id="CAP0e6ba25c-a8a1-45fb-a957-43c3195f9c326">HTML TEST</title>
        <note type=""><![CDATA[<div class="NoteContent">
                  <b>XML in the Note</b></div>]]><html>&lt;h1&gt;Headline&lt;/h1&gt;&lt;h2 style="color:green"&gt; Grüne Schrift &lt;/h2&gt; &lt;h3 style="color:black"&gt; Schwarze Schrift &lt;/h3&gt; &lt;h4 style="color:#FF0000"&gt; Rote Schrift &lt;/h4&gt;</html>

        </note>
      </infobox>

So, as you can see, the <html> element should stay outside the CDATA section.

I am open to any suggestions, whether for editing the XSL shown above or for additional XSL.

Thank you all.


Solution

  • Selecting the non-html elements and wrapping each of them in a div that is then serialized is straightforward in XSLT 3:

    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:xs="http://www.w3.org/2001/XMLSchema"
        exclude-result-prefixes="#all"
        version="3.0">
    
      <xsl:mode on-no-match="shallow-copy"/>
    
      <xsl:template match="note/*[not(self::html)]">
          <xsl:variable name="div-wrapper" as="element(div)">
              <div class="note">
                  <xsl:apply-templates/>
              </div>          
          </xsl:variable>
          <xsl:value-of select="serialize($div-wrapper)"/>
      </xsl:template>
    
    </xsl:stylesheet>
    

    https://xsltfiddle.liberty-development.net/a9GPfA

    Creating the CDATA sections could be delegated to <xsl:output cdata-section-elements="note"/>; see https://xsltfiddle.liberty-development.net/a9GPfA/1. As you can see there, however, it is then harder to control what happens to whitespace-only content inside a note: with your sample you get a CDATA section containing only whitespace after the html element.
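
    As a rough sketch of that second variant, the stylesheet below combines the template from above with cdata-section-elements. The xsl:strip-space declaration is an assumption added here and is not part of the linked fiddle: it should remove the whitespace-only text nodes inside note from the input tree, so that no whitespace-only CDATA section is left after the html element. Treat it as a starting point and check it against your real data.

    ```xml
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:xs="http://www.w3.org/2001/XMLSchema"
        exclude-result-prefixes="#all"
        version="3.0">

      <!-- Text node children of note are written as CDATA sections;
           text inside the nested html element is escaped as usual. -->
      <xsl:output cdata-section-elements="note"/>

      <!-- Assumption: whitespace-only text directly inside note can be dropped. -->
      <xsl:strip-space elements="note"/>

      <!-- Copy everything that has no more specific template. -->
      <xsl:mode on-no-match="shallow-copy"/>

      <!-- Each non-html child of note becomes a serialized div, i.e. a text node. -->
      <xsl:template match="note/*[not(self::html)]">
          <xsl:variable name="div-wrapper" as="element(div)">
              <div class="note">
                  <xsl:apply-templates/>
              </div>
          </xsl:variable>
          <xsl:value-of select="serialize($div-wrapper)"/>
      </xsl:template>

    </xsl:stylesheet>
    ```

    Because the serialized div ends up as an ordinary text node child of note, the serializer decides where the CDATA section begins and ends, so the stylesheet does not need disable-output-escaping.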