I looked around for ways to unflatten a document with XSLT, but none of them really works for me, although I believe my case is pretty simple. I have a collection of HTML files, all with the same structure, that I would like to unflatten with an XSL transformation. Basically, the idea is to wrap in a <div> element all the elements that follow a <p class='subtitle'>, up to the next <p class='subtitle'>, and (ideally!) still apply transformations to the individual elements, although that part is optional (see below).
Source file looks like:
[...some stuff on the page]
<p class='header'>Some text</p>
<p class='subtitle'>Subtitle 1</p>
<p class='content'>First paragraph of part 1, with some <span>Inside</span> and other
nested elements, on multiple levels</p>
<ul>a list with <li> inside</ul>
<p class='content'>Second paragraph of part 1</p>
<img src='xyz.jpg'/>
<p class='content'>Third paragraph of part 1</p>
<p class='subtitle'>Subtitle 2</p>
<p class='content'>First paragraph of part 2</p>
<p class='content'>Second paragraph of part 2</p>
<p class='subtitle'>Subtitle 3
[and so on…]
And I would like to turn this into:
<div n='section1'>
<head>Subtitle 1</head>
<p>First paragraph of part 1, with some <span>Inside</span> and other
nested elements, on multiple levels</p>
<ul>a list with <li> inside</ul>
<p>Second paragraph of part 1</p>
<picture source='xyz.jpg'/>
<p>Third paragraph of part 1</p>
</div>
<div n="section2">
<head>Subtitle 2</head>
<p>First paragraph of part 2</p>
<p>Second paragraph of part 2</p>
</div>
<div n="section3">
<head>Subtitle 3</head>
[and so on…]
I cannot find my way around this issue. Even a first step that only unflattens the HTML file (copying the elements into the div unchanged, without transforming them) would already be amazing.
Thanks in advance!
This is a classic positional grouping problem. To get you started:
<xsl:template match="body">
<body>
<xsl:for-each-group select="*" group-starting-with="p[@class='subtitle']">
<xsl:choose>
<xsl:when test="@class='subtitle'">
<div n="section{count(preceding-sibling::p[@class='subtitle']) + 1}">
<head>{.}</head>
<xsl:apply-templates select="tail(current-group())"/>
</div>
</xsl:when>
<xsl:otherwise>
<xsl:apply-templates select="current-group()"/>
</xsl:otherwise>
</xsl:choose>
</xsl:for-each-group>
</body>
</xsl:template>
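The grouping template hands every grouped element to xsl:apply-templates, so the per-element changes from the question (content paragraphs losing their class, img becoming picture) need rules of their own. A minimal sketch of such companion rules follows. It assumes an XSLT 3.0 processor and input elements in no namespace; for XHTML input you would probably need xpath-default-namespace as well. The body template above would sit alongside these rules in the same stylesheet.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Anything without a rule of its own (ul, li, span, text, ...) is copied unchanged. -->
  <xsl:mode on-no-match="shallow-copy"/>

  <!-- <p class='content'> becomes a plain <p>; its children are processed recursively. -->
  <xsl:template match="p[@class='content']">
    <p>
      <xsl:apply-templates/>
    </p>
  </xsl:template>

  <!-- <img src='...'/> becomes <picture source='...'/>. -->
  <xsl:template match="img">
    <picture source="{@src}"/>
  </xsl:template>

</xsl:stylesheet>
```

Leaving out the two element rules gives the "copy only" first step: the shallow-copy mode then reproduces each grouped element as-is inside its div.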
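Note that xsl:for-each-group requires XSLT 2.0 or later; it's considerably more difficult with XSLT 1.0. As written, the template also uses two XSLT 3.0 features: the tail() function and the {.} text value template, which only takes effect when expand-text="yes" is in scope. On a 2.0 processor, use current-group()[position() gt 1] and <xsl:value-of select="."/> instead.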
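If you are stuck on XSLT 1.0, one common workaround is a key that indexes each element by its nearest preceding subtitle. The sketch below only unflattens, copying the grouped elements unchanged. The key name by-subtitle is my own choice, and the sketch assumes the subtitles and the elements to be grouped are direct children of body, in no namespace.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Index each non-subtitle child of body by the id of its nearest preceding
       subtitle. Elements before the first subtitle get the empty string as key. -->
  <xsl:key name="by-subtitle"
           match="body/*[not(self::p[@class='subtitle'])]"
           use="generate-id(preceding-sibling::p[@class='subtitle'][1])"/>

  <!-- Identity template: copy everything that has no more specific rule. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="body">
    <body>
      <!-- Leading elements (before the first subtitle) stay outside any div. -->
      <xsl:apply-templates select="key('by-subtitle', '')"/>
      <!-- Each subtitle then builds its own section. -->
      <xsl:apply-templates select="p[@class='subtitle']"/>
    </body>
  </xsl:template>

  <xsl:template match="body/p[@class='subtitle']">
    <div n="section{count(preceding-sibling::p[@class='subtitle']) + 1}">
      <head><xsl:value-of select="."/></head>
      <!-- Pull in the elements whose nearest preceding subtitle is this one. -->
      <xsl:apply-templates select="key('by-subtitle', generate-id())"/>
    </div>
  </xsl:template>

</xsl:stylesheet>
```

The key replaces xsl:for-each-group here: each subtitle looks up its own members, so a subtitle is never compared against its siblings one by one.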