XSLT Organising a list of lists


I'm trying to re-work a list of lists, moving from XML to a proprietary XML-based file format.

Essentially the input is something like

<ul>
  <li>Item 1</li>
  <li><ul><li>Sub item 1</li><li>Sub item 2</li></ul></li>

  <li>Item 2</li>
  <li><ul><li>Sub item 3</li><li>Sub item 4</li></ul></li>
</ul>

which obviously looks like this:

  • Item 1
    • Sub item 1
    • Sub item 2
  • Item 2
    • Sub item 3
    • Sub item 4

But I need the sub item lists within the same li tag as their respective header. So, like this:

  • Item 1
    • Sub item 1
    • Sub item 2
  • Item 2
    • Sub item 3
    • Sub item 4

When I test against the original input above, I can't come up with an XPath that picks up the first sub-item ul without also picking up the second sub-item ul.

Running the input through my transform currently produces

<ul>
  <li>Item 1
    <ul><li>Sub item 1</li><li>Sub item 2</li></ul>
    <ul><li>Sub item 3</li><li>Sub item 4</li></ul>
  </li>

  <li>Item 2
    <ul><li>Sub item 3</li><li>Sub item 4</li></ul>
  </li>
</ul>

The XPath that has got me this far is

following-sibling::li[not(normalize-space(text()))]/*[1][self::ul or self::ol]
  

The normalize-space test is there to isolate li elements that have no text of their own, just a ul or ol inside them. I have tried multiple variants of the above: setting the index to [1] just returns them all, and [2] returns nothing.

I'm a bit stuck and grateful for any input or suggestions!


Solution

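  • This looks like a grouping problem, which you can tackle in XSLT 2 or later (the current version of XSLT is 3.0) with for-each-group and group-starting-with. Note that tail() as used below needs an XPath 3.0 processor; with XSLT 2.0, subsequence(current-group(), 2) selects the same items: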

      <xsl:template match="ul[.//ul]">
        <xsl:copy>
          <xsl:for-each-group select="li" group-starting-with="li[not(ul)]">
            <xsl:copy>
              <xsl:apply-templates select="node(), tail(current-group())/*"/>
            </xsl:copy>
          </xsl:for-each-group>
        </xsl:copy>
      </xsl:template>
    

    Example fiddle is here.
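
    If you are limited to XSLT 1.0, the sibling-axis approach from the question can probably be made to work as well. The expression in the question over-selects because following-sibling::li has no positional predicate, so it reaches every later wrapper li, and the [1] on * is counted separately inside each of those li elements (which is also why [2] finds nothing). Restricting the step to following-sibling::li[1] limits it to the li directly after the header. The stylesheet below is only a sketch under some assumptions that are not stated in the question: header items always carry text of their own, every wrapper li comes directly after its header, and whitespace-only text nodes can be stripped.

```xslt
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: whitespace-only text inside lists is not significant -->
  <xsl:strip-space elements="ul ol li"/>

  <!-- Identity template: copy everything that is not handled below -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Header li (has text): copy it, then pull in the list held by the
       li that follows it directly, if that li is a text-less wrapper -->
  <xsl:template match="li[normalize-space(text())]">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
      <xsl:apply-templates
          select="following-sibling::li[1][not(normalize-space(text()))]
                  /*[self::ul or self::ol]"/>
    </xsl:copy>
  </xsl:template>

  <!-- Wrapper li (no text, only a nested list): suppressed here because
       its list has been moved into the preceding header li -->
  <xsl:template match="li[not(normalize-space(text()))][*[self::ul or self::ol]]"/>

</xsl:stylesheet>
```

    With the sample input, each header li should end up containing only its own sub-list. One caveat: a wrapper li that has no header before it would be dropped together with its contents, so that case would need an extra template.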