Tags: xslt, xslt-2.0, xslt-grouping

Creating a wrapper element for multiple elements with different names and different @class attribute values


I have the following flat XML structure:

<div class="section-level-1">

  <!-- other elements -->

  <p class="para">
    <img src="..." alt="..." title="..." />
  </p>
  <p class="figure-caption-german">
    <img src="..." alt="..." title="..." />
  </p>
  <p class="figure-caption-english">
    <img src="..." alt="..." title="..." />
  </p>

  <!-- other elements -->

  <p class="para">
    <img src="..." alt="..." title="..." />
  </p>
  <p class="figure-caption-german">
    <img src="..." alt="..." title="..." />
  </p>
  <misc-element>...</misc-element>
  <p class="figure-caption-english">
    <img src="..." alt="..." title="..." />
  </p>
</div>

The order of these elements is always the same (para -> figure-caption-german -> figure-caption-english); however, I can't rule out that the sequence will be interrupted by other elements (here the misc-element).

I want to wrap these three elements inside a single element:

<div class="section-level-1">

  <!-- other elements -->

  <div class="figure">
    <p class="para">
      <img src="..." alt="..." title="..." />
    </p>
    <p class="figure-caption-german">
      <img src="..." alt="..." title="..." />
    </p>
    <p class="figure-caption-english">
      <img src="..." alt="..." title="..." />
    </p>
  </div>

  <!-- other elements -->

  <div class="figure">
    <p class="para">
      <img src="..." alt="..." title="..." />
    </p>
    <p class="figure-caption-german">
      <img src="..." alt="..." title="..." />
    </p>
    <p class="figure-caption-english">
      <img src="..." alt="..." title="..." />
    </p>
  </div>
</div>

The interrupting element(s) don't need to be preserved and can be deleted.

What I have so far:

<xsl:template match="/">
  <xsl:apply-templates />
</xsl:template>

<!-- Html Ninja Pattern -->

<xsl:template match="*">
  <xsl:element name="{name()}">
    <xsl:apply-templates select="* | @* | text()"/>
  </xsl:element>
</xsl:template>

<xsl:template match="body//@*">
  <xsl:attribute name="{name(.)}">
    <xsl:value-of select="."/>
  </xsl:attribute>
</xsl:template>

<!-- Modify certain elements -->

<xsl:template match="" priority="1">
  <!-- do something -->
</xsl:template>

As a basic pattern I draw on the "Html Ninja Technique" (http://getsymphony.com/learn/articles/view/html-ninja-technique/), since it lets me tackle only the particular elements I need to transform while sending all other elements to the output tree unchanged. So far everything has worked fine, but now I really seem to have hit a roadblock. I'm not even sure I can accomplish the desired task by relying on the "Html Ninja Technique".

Any help or indication would be highly appreciated.

Best regards and thank you, Matthias Einbrodt


Solution

  • It's a little involved, but I think this should do it:

    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output method="xml" indent="yes"/>
    
      <!-- Default rule: rebuild every element and process its children -->
      <xsl:template match="*" name="Copy">
        <xsl:element name="{name()}">
          <xsl:apply-templates select="* | @* | text()"/>
        </xsl:element>
      </xsl:template>
    
      <xsl:template match="@*">
        <xsl:attribute name="{name(.)}">
          <xsl:value-of select="."/>
        </xsl:attribute>
      </xsl:template>
    
      <xsl:template match="div[starts-with(@class, 'section-level')]">
        <xsl:copy>
          <xsl:apply-templates select="@*" />
          <!-- Start from each para, from the first child element, and from any
               element that immediately follows a figure-caption-english -->
          <xsl:apply-templates select="p[@class = 'para'] | 
                                     *[not(preceding-sibling::*) or
                                        preceding-sibling::*[1][self::p]
                                          [@class = 'figure-caption-english']
                                      ]"
                               mode="iter"/>
        </xsl:copy>
      </xsl:template>
    
      <xsl:template match="p[@class = 'para']" mode="iter">
        <div class="figure">
          <xsl:call-template name="Copy" />
          <!-- Apply templates to the next english and german figure captions -->
          <xsl:apply-templates
            select="following-sibling::p[@class = 'figure-caption-german'][1] |
                    following-sibling::p[@class = 'figure-caption-english'][1]" />
        </div>
      </xsl:template>
    
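      <!-- Any other element: copy it, then walk on to the next sibling
           unless that sibling is a para, which starts a new figure -->
      <xsl:template match="*" mode="iter">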
        <xsl:call-template name="Copy" />
        <xsl:apply-templates 
            select="following-sibling::*[1]
                          [not(self::p[@class = 'para'])]"
            mode="iter"/>
      </xsl:template>
    </xsl:stylesheet>
    

    When applied to this sample data:

    <div class="section-level-1">
    
      <!-- other elements -->
      <div>hello</div>
      <div>hello</div>
      <div>hello</div>
      <div>hello</div>
      <p class="para">
        <img src="..." alt="..." title="..." />
      </p>
      <p class="figure-caption-german">
        <img src="..." alt="..." title="..." />
      </p>
      <p class="figure-caption-english">
        <img src="..." alt="..." title="..." />
      </p>
    
      <!-- other elements -->
      <div>hello</div>
      <div>hello</div>
      <div>hello</div>
    
      <p class="para">
        <img src="..." alt="..." title="..." />
      </p>
      <p class="figure-caption-german">
        <img src="..." alt="..." title="..." />
      </p>
      <misc-element>...</misc-element>
      <p class="figure-caption-english">
        <img src="..." alt="..." title="..." />
      </p>
      <div>hello</div>
      <div>hello</div>
      <div>hello</div>
    </div>
    

    It produces:

    <div class="section-level-1">
      <div>hello</div>
      <div>hello</div>
      <div>hello</div>
      <div>hello</div>
      <div class="figure">
        <p class="para">
          <img src="..." alt="..." title="..." />
        </p>
        <p class="figure-caption-german">
          <img src="..." alt="..." title="..." />
        </p>
        <p class="figure-caption-english">
          <img src="..." alt="..." title="..." />
        </p>
      </div>
      <div>hello</div>
      <div>hello</div>
      <div>hello</div>
      <div class="figure">
        <p class="para">
          <img src="..." alt="..." title="..." />
        </p>
        <p class="figure-caption-german">
          <img src="..." alt="..." title="..." />
        </p>
        <p class="figure-caption-english">
          <img src="..." alt="..." title="..." />
        </p>
      </div>
      <div>hello</div>
      <div>hello</div>
      <div>hello</div>
    </div>
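
    Since the question is tagged xslt-2.0, the same wrapping could probably also be expressed with xsl:for-each-group and group-starting-with. The following is only a sketch and has not been run against the sample data. It assumes an XSLT 2.0 processor, that every p with class "para" starts a figure, and that each figure has an English caption (if one is missing, everything after that para up to the next para would be dropped). It also swaps the Html Ninja templates for a plain identity template.

    ```xslt
    <xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output method="xml" indent="yes"/>

      <!-- Identity template: copy everything that has no more specific rule -->
      <xsl:template match="@* | node()">
        <xsl:copy>
          <xsl:apply-templates select="@* | node()"/>
        </xsl:copy>
      </xsl:template>

      <xsl:template match="div[starts-with(@class, 'section-level')]">
        <xsl:copy>
          <xsl:apply-templates select="@*"/>
          <!-- A new group begins at every para; elements before the first
               para form a leading group of their own -->
          <xsl:for-each-group select="*" group-starting-with="p[@class = 'para']">
            <xsl:choose>
              <xsl:when test="self::p[@class = 'para']">
                <!-- Assumption: the first English caption closes the figure -->
                <xsl:variable name="english"
                  select="current-group()[self::p[@class = 'figure-caption-english']][1]"/>
                <div class="figure">
                  <xsl:apply-templates
                    select="., current-group()[self::p[@class = 'figure-caption-german']][1], $english"/>
                </div>
                <!-- Copy whatever follows the English caption in this group -->
                <xsl:apply-templates select="current-group()[. >> $english]"/>
              </xsl:when>
              <xsl:otherwise>
                <!-- Leading group without a para: copy unchanged -->
                <xsl:apply-templates select="current-group()"/>
              </xsl:otherwise>
            </xsl:choose>
          </xsl:for-each-group>
        </xsl:copy>
      </xsl:template>
    </xsl:stylesheet>
    ```

    In this sketch, elements between the para and the English caption (such as misc-element) are dropped, while elements after the English caption are passed through the identity template.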