
Wrap everything between two tag occurrences with XSLT 1.0


I want to split an XML (actually, XHTML) document into sections at the top-level h1 tags. Everything from one <h1> up to, but not including, the next one should be wrapped in a <section> element, and so on until the end of the document.

For example, if I have this source document:

<article>
    <h1>Heading 1</h1>
    <p>Some text</p>
    <p>Some more text</p>

    <h1>Heading 2</h1>
    <p>Some text</p>
    <h2>Subheading</h2>
    <p>Some text</p>

    <h1 id="heading3">Heading 3</h1>
    <p>Some text</p>
</article>

I want the result to be exactly like this:

<article>
    <section>
        <h1>Heading 1</h1>
        <p>Some text</p>
        <p>Some more text</p>
    </section>
    <section>
        <h1>Heading 2</h1>
        <p>Some text</p>
        <h2>Subheading</h2>
        <p>Some text</p>
    </section>
    <section>
        <h1 id="heading3">Heading 3</h1>
        <p>Some text</p>
    </section>
</article>

The problem is that all I have is libxslt1.1 (thus, XSLT 1.0 + EXSLT). With XSLT 2.0 I could have done something with a nice-looking <xsl:for-each-group select="*" group-starting-with="h1">, but sadly that's not a viable option for me.

I don't want to group on attribute values (I don't have any meaningful attributes), so, as I understand it, Muenchian grouping is not a trick that would work for me. Maybe I'm wrong, though - I only read about this method a few minutes ago.

Is there any way to achieve this with XSLT 1.0?


Solution

  • as I understand it, Muenchian grouping is not a trick that would work for me.

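    Well, something very close to it will: define a key that indexes every non-h1 element by the generate-id() of its nearest preceding h1 sibling, then, for each h1, copy it together with the elements the key returns for it: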

    XSLT 1.0

    <xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
    
    <!-- Index every non-h1 element by the generated id of its nearest preceding h1 sibling -->
    <xsl:key name="grpById" match="*[not(self::h1)]" use="generate-id(preceding-sibling::h1[1])" />
    
    <xsl:template match="/article">
        <xsl:copy>
            <!-- Each h1 starts a new section -->
            <xsl:for-each select="h1">
                <section>
                    <!-- The h1 itself plus the elements keyed to it, in document order -->
                    <xsl:copy-of select=". | key('grpById', generate-id())"/>
                </section>
            </xsl:for-each>
        </xsl:copy>
    </xsl:template>
    
    </xsl:stylesheet>
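
    Note that elements appearing before the first h1 have no preceding h1 sibling, so they are keyed under an empty string and never copied, and any attributes on article are dropped. If your real documents can contain either, a variation along the following lines might help. This is only a sketch under a few assumptions: the key name byHeading is made up for illustration, the elements are in no namespace (as in the example above), and text nodes directly under article are only whitespace, since neither version copies them.

    ```xml
    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

    <!-- Only index direct children of article (assumption: sections are split at this level only) -->
    <xsl:key name="byHeading" match="article/*[not(self::h1)]"
             use="generate-id(preceding-sibling::h1[1])"/>

    <xsl:template match="/article">
        <xsl:copy>
            <!-- Keep attributes of article, if there are any -->
            <xsl:copy-of select="@*"/>
            <!-- Elements before the first h1 belong to no heading; copy them unwrapped -->
            <xsl:copy-of select="*[not(self::h1)][not(preceding-sibling::h1)]"/>
            <xsl:for-each select="h1">
                <section>
                    <xsl:copy-of select=". | key('byHeading', generate-id())"/>
                </section>
            </xsl:for-each>
        </xsl:copy>
    </xsl:template>

    </xsl:stylesheet>
    ```

    For the sample document in the question, which has no leading elements and no attributes on article, this should give the same three sections as the stylesheet above.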