xml xslt xslt-1.0 tei

Transforming TEI p into lg


I have a TEI (Text Encoding Initiative) document containing:

<div>
  <p>
     some text, and maybe nodes <note>A note</note><lb />
     and some more text<lb />
     final line without lb
  </p>
</div>

and I want to transform it to:

<div>
  <lg>
     <l>some text, and maybe nodes <note>A note</note></l>
     <l>and some more text</l>
     <l>final line without lb</l>
  </lg>
</div>

Transforming the p into an lg is trivial using:

<xsl:template match="tei:div/tei:p">
   <lg>
     <xsl:apply-templates/>
   </lg>
</xsl:template>

But I can't figure out how to do the rest: turning each sequence of nodes between lb elements into the children of a new parent l element.

A solution for XSLT 1.0 would be great.


Solution

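  • You could use a technique called Muenchian grouping here. In this case, you can group the child nodes of a p element by the number of lb elements that follow them. The key value also includes the generated id of the parent, so that nodes from different p elements never end up in the same group: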

    <xsl:key name="p-nodes" match="tei:p/node()" use="concat(generate-id(..), '|', count(following-sibling::tei:lb))" />
    

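    To see what this key produces, a small diagnostic stylesheet can print the lb count for every child node of a p. This is only a sketch for inspection, not part of the solution; it assumes the same placeholder namespace ("tei") as the rest of this answer, so the input elements must be in that namespace.

    ```xml
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0" xmlns:tei="tei">
        <xsl:output method="text" />

        <!-- For each child node of every p, print "<lb count> | <what the node is>" -->
        <xsl:template match="/">
            <xsl:for-each select="//tei:p/node()">
                <!-- The varying part of the key value: number of lb siblings after this node -->
                <xsl:value-of select="count(following-sibling::tei:lb)" />
                <xsl:text> | </xsl:text>
                <xsl:choose>
                    <!-- Elements are shown by name, e.g. <note> or <lb> -->
                    <xsl:when test="self::*">
                        <xsl:value-of select="concat('&lt;', local-name(), '&gt;')" />
                    </xsl:when>
                    <!-- Text nodes are shown with their whitespace collapsed -->
                    <xsl:otherwise>
                        <xsl:value-of select="normalize-space()" />
                    </xsl:otherwise>
                </xsl:choose>
                <xsl:text>&#10;</xsl:text>
            </xsl:for-each>
        </xsl:template>
    </xsl:stylesheet>
    ```

    For the sample p in the question, the first text node and the note get 2, the first lb and the text after it get 1, and the second lb and the final text get 0. Each lb therefore lands at the start of the group for the line that follows it, which is why the lb elements have to be filtered out when the group is written.
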
    To get the first node in each group, one for each l element you want to output, select them like so (here $parentId holds the generate-id() of the p element):

    <xsl:for-each
         select="node()[generate-id() = generate-id(key('p-nodes', concat($parentId, '|', count(following-sibling::tei:lb)))[1])]">
    

    To output the l element itself and the contents of the group, use the key again, filtering out the lb elements:

    <l><xsl:apply-templates select="key('p-nodes', concat($parentId, '|', count(following-sibling::tei:lb)))[not(self::tei:lb)]" /></l>
    

    Try this XSLT (changing the namespace bound to the tei prefix to match the real one in your XML; TEI P5 documents normally use http://www.tei-c.org/ns/1.0):

    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0" xmlns:tei="tei">
        <xsl:output method="xml" indent="yes" />
    
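        <!-- Index each child node of a p by its parent's id plus the number of lb siblings after it -->
        <xsl:key name="p-nodes" match="tei:p/node()" use="concat(generate-id(..), '|', count(following-sibling::tei:lb))" />
    
        <xsl:template match="tei:div/tei:p">
           <lg>
                <xsl:variable name="parentId" select="generate-id()" />
                <!-- Visit only the first node of each group: one iteration per output line -->
                <xsl:for-each select="node()[generate-id() = generate-id(key('p-nodes', concat($parentId, '|', count(following-sibling::tei:lb)))[1])]">
                    <!-- Process the whole group inside one l, dropping the lb separator -->
                    <l><xsl:apply-templates select="key('p-nodes', concat($parentId, '|', count(following-sibling::tei:lb)))[not(self::tei:lb)]" /></l>
                </xsl:for-each>
           </lg>
        </xsl:template>
    
        <!-- Identity template: copy everything else unchanged -->
        <xsl:template match="@*|node()">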
            <xsl:copy>
                <xsl:apply-templates select="@*|node()"/>
            </xsl:copy>
        </xsl:template>
    </xsl:stylesheet>
    

    See it in action at http://xsltransform.net/gWEamMf
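
    One thing to watch: the literal lg and l elements in the stylesheet above are created in no namespace, while the rest of a TEI document is normally in http://www.tei-c.org/ns/1.0. A variant along these lines might keep the new elements in the TEI namespace. It is a sketch that assumes your document uses the standard TEI P5 namespace; the default namespace declaration and exclude-result-prefixes are additions not present in the stylesheet above.

    ```xml
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
        xmlns:tei="http://www.tei-c.org/ns/1.0"
        xmlns="http://www.tei-c.org/ns/1.0"
        exclude-result-prefixes="tei">
        <xsl:output method="xml" indent="yes" />

        <!-- Same grouping key as before: parent id plus number of following lb siblings -->
        <xsl:key name="p-nodes" match="tei:p/node()" use="concat(generate-id(..), '|', count(following-sibling::tei:lb))" />

        <xsl:template match="tei:div/tei:p">
            <!-- Unprefixed literal elements now pick up the default (TEI) namespace -->
            <lg>
                <xsl:variable name="parentId" select="generate-id()" />
                <xsl:for-each select="node()[generate-id() = generate-id(key('p-nodes', concat($parentId, '|', count(following-sibling::tei:lb)))[1])]">
                    <l><xsl:apply-templates select="key('p-nodes', concat($parentId, '|', count(following-sibling::tei:lb)))[not(self::tei:lb)]" /></l>
                </xsl:for-each>
            </lg>
        </xsl:template>

        <!-- Identity template: copy everything else unchanged -->
        <xsl:template match="@*|node()">
            <xsl:copy>
                <xsl:apply-templates select="@*|node()"/>
            </xsl:copy>
        </xsl:template>
    </xsl:stylesheet>
    ```

    The match patterns and the key still use the tei prefix, because XPath 1.0 expressions ignore the default namespace; only the literal result elements are affected by the xmlns declaration.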