Tags: xml, xslt, xslt-2.0

Convert HTML colspan and rowspan to "empty" cells


For a certain output format (not HTML-like), I need to convert HTML tables to 'square' tables, where each colspan and rowspan is not only indicated in the parent cell, but also followed by the correct number of empty cells.

For example, the simple HTML table

<table>
    <tr>
        <th>test</th>
        <th colspan="2">span 1/2</th>
        <th colspan="3">span 2/2</th>
    </tr>
    <tr>
        <td>col 1</td>
        <td>col 2</td>
        <td>col 3</td>
        <td>col 4</td>
        <td>col 5</td>
        <td>col 6</td>
    </tr>
</table>

should be translated to

<table>
    <tr>
        <th>test</th>
        <th colspan="2">span 1/2</th>
        <th />  <!-- < empty cell added -->
        <th colspan="3">span 2/2</th>
        <th />  <!-- < empty cell added -->
        <th />  <!-- < empty cell added -->
    </tr>
    ..

(note: the output format uses a very different syntax, this is for clarity only!)

and, similarly, rowspans should be propagated to next <tr> lines:

<table><tr><td rowspan="3" /><td rowspan="2" /><td /></tr>
    <tr><td>data</td></tr>
    <tr><td>data</td><td>data</td></tr>
</table>

which should come out as

<table>
    <tr><td /><td /><td /></tr>
    <tr><td /><td /><td>data</td></tr>  <!-- 2 empty cells added -->
    <tr><td /><td>data</td><td>data</td></tr>  <!-- 1 empty cell added -->
</table>

Handling colspan is straightforward:

<xsl:template name="add-empty">
    <xsl:param name="repeat" />

    <xsl:if test="$repeat &gt; 1">
        <td class="empty" />
        <xsl:call-template name="add-empty">
            <xsl:with-param name="repeat" select="$repeat - 1" />
        </xsl:call-template>
    </xsl:if>
</xsl:template>

<xsl:template match="th|td">
    <td>
        <xsl:apply-templates />
    </td>
    <xsl:if test="@colspan">
        <xsl:call-template name="add-empty">
            <xsl:with-param name="repeat" select="@colspan" />
        </xsl:call-template>
    </xsl:if>
</xsl:template>

This outputs each th or td as a single cell, checks its colspan, and inserts as many empty cells as needed through a recursive call to the template add-empty. The class attribute empty is for debugging only.
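Since XSLT 2.0 is available, the same colspan expansion can also be sketched without recursion, using a range expression. This is only an illustration under a few assumptions: the input elements are in no namespace, every colspan value is a valid integer, and the identity template is added here purely to make the sketch self-contained. Unlike the template above, it keeps th cells as th.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

    <!-- Identity template (assumption: everything else is copied unchanged). -->
    <xsl:template match="@* | node()">
        <xsl:copy>
            <xsl:apply-templates select="@* | node()" />
        </xsl:copy>
    </xsl:template>

    <xsl:template match="th | td">
        <!-- Remember the cell: inside the for-each below the context item
             is an integer, not a node. -->
        <xsl:variable name="cell" select="." />
        <xsl:copy>
            <xsl:apply-templates select="@* | node()" />
        </xsl:copy>
        <!-- "2 to 1" is the empty sequence, so a cell without colspan
             (or with colspan="1") gets no extra cells. -->
        <xsl:for-each select="2 to xs:integer((@colspan, 1)[1])">
            <xsl:element name="{name($cell)}">
                <xsl:attribute name="class">empty</xsl:attribute>
            </xsl:element>
        </xsl:for-each>
    </xsl:template>

</xsl:stylesheet>
```

For the first example table, a colspan of 2 gives one extra th and a colspan of 3 gives two, so the header row ends up with six cells, matching the six cells in the data row.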

The problem is the rowspans. For these to work properly, the transformation needs to scan every previous tr and keep count of which columns need to be empty. That iteration would be something like

<xsl:if test="position() &gt; 1">
    <xsl:variable name="currentRow" select="position()" />
    <xsl:for-each select="../tr[position() &lt; $currentRow]">
        <xsl:message>testing <xsl:value-of select="." /></xsl:message>
    </xsl:for-each>
</xsl:if>

This does not need to be called on the first row, because for that row only colspans need adding. The question, then, is two-fold: how would I build the cell set list so that it adds up to a correct set for the current row? And with such a list, how can I iterate over both this list (which is as long as the total number of columns in the table) and each row's th|td elements?

The latter is a problem because I can iterate over either the cell set using something like

<xsl:for-each select="1 to string-length($cell-set)">
  <xsl:if test="substring($cell-set, ., 1) = 'E'">
    .. empty ..
  </xsl:if>
  ...
</xsl:for-each>

(if cell-set is a string), or over the 'current' tr contents using

<xsl:for-each select="th|td">
  ..

in which case there is no direct relation to the contents of cell-set. With the first, I don't know which index of td|th to insert, with the second I don't know when to insert a blank.
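One possible way to relate the two, sketched under assumptions: iterate over the columns of the cell set, and whenever a column is not 'E', derive the index of the real cell by counting the non-'E' characters up to and including that column. The template name emit-row and its parameters are hypothetical, and the sketch assumes the cell-set string for the row has already been built and that colspans have already been expanded.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

    <!-- Hypothetical named template. $cell-set has one character per column:
         'E' means "covered by a span from above", anything else "real cell". -->
    <xsl:template name="emit-row">
        <xsl:param name="row" as="element(tr)" />
        <xsl:param name="cell-set" as="xs:string" />
        <xsl:variable name="cells" select="$row/(th | td)" />
        <tr>
            <xsl:for-each select="1 to string-length($cell-set)">
                <xsl:variable name="col" select="." />
                <xsl:choose>
                    <xsl:when test="substring($cell-set, $col, 1) = 'E'">
                        <td class="empty" />
                    </xsl:when>
                    <xsl:otherwise>
                        <!-- Number of non-'E' slots up to and including this
                             column = position of the real cell in the row. -->
                        <xsl:variable name="index"
                            select="string-length(translate(substring($cell-set, 1, $col), 'E', ''))" />
                        <xsl:copy-of select="$cells[$index]" />
                    </xsl:otherwise>
                </xsl:choose>
            </xsl:for-each>
        </tr>
    </xsl:template>

</xsl:stylesheet>
```

Called with the second row of the rowspan example and a cell set such as 'EEx', this would emit two empty cells followed by the row's single data cell. The copied cell still carries any rowspan attribute it had.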


Solution

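  • Based on http://andrewjwelch.com/code/xslt/table/table-normalization.html, which I already linked to, you could use the stylesheet below. It works in three passes: mode colspan replaces each colspan with the matching number of empty cells, mode rowspan walks the rows in order and hands each normalized row to the next one so that columns still covered by a rowspan receive an empty cell, and mode final strips the remaining rowspan attributes: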

    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" version="2.0" exclude-result-prefixes="xs">
    
        <xsl:output indent="yes" omit-xml-declaration="yes" />
    
        <xsl:variable name="table_with_no_colspans">
            <xsl:apply-templates mode="colspan" />
        </xsl:variable>
    
        <xsl:variable name="table_with_no_rowspans">
            <xsl:for-each select="$table_with_no_colspans">
                <xsl:apply-templates mode="rowspan" />
            </xsl:for-each>
        </xsl:variable>
    
        <xsl:template match="/">
            <xsl:apply-templates select="$table_with_no_rowspans" mode="final" />
        </xsl:template>
    
        <xsl:template match="@*|*" mode="#all">
            <xsl:copy>
                <xsl:apply-templates select="@*|*" mode="#current" />
            </xsl:copy>
        </xsl:template>
    
        <xsl:template match="td | th" mode="colspan">
            <xsl:choose>
                <xsl:when test="@colspan">
                    <xsl:copy>
                        <xsl:copy-of select="@* except @colspan"/>
                        <xsl:apply-templates/>
                    </xsl:copy>
                    <xsl:for-each select="2 to @colspan">
                        <td/>
                    </xsl:for-each>
                </xsl:when>
                <xsl:otherwise>
                    <xsl:copy-of select="." />
                </xsl:otherwise>
            </xsl:choose>
        </xsl:template>
    
        <!-- make sure it works for both table/tr and table/tbody/tr -->
        <xsl:template match="tbody|table[not(tbody)]" mode="rowspan">
            <xsl:copy>
                <xsl:copy-of select="tr[1]" />
                <xsl:apply-templates select="tr[2]" mode="rowspan">
                    <xsl:with-param name="previousRow" select="tr[1]" />
                </xsl:apply-templates>
            </xsl:copy>
        </xsl:template>
    
        <xsl:template match="tr" mode="rowspan">
            <xsl:param name="previousRow" as="element()" />
    
            <xsl:variable name="currentRow" select="." />
    
            <xsl:variable name="normalizedTDs">
                <xsl:for-each select="$previousRow/*">
                    <xsl:choose>
                        <xsl:when test="@rowspan &gt; 1">
                            <xsl:copy>
                                <xsl:attribute name="rowspan">
                                    <xsl:value-of select="@rowspan - 1" />
                                </xsl:attribute><!--
                                <xsl:copy-of select="@*[not(name() = 'rowspan')]" />
                                <xsl:copy-of select="node()" />
                            --></xsl:copy>
                        </xsl:when>
                        <xsl:otherwise>
                            <xsl:copy-of select="$currentRow/*[1 + count(current()/preceding-sibling::*[not(@rowspan) or (@rowspan = 1)])]" />
                        </xsl:otherwise>
                    </xsl:choose>
                </xsl:for-each>
            </xsl:variable>
    
            <xsl:variable name="newRow" as="element(tr)">
                <xsl:copy>
                    <xsl:copy-of select="$currentRow/@*" />
                    <xsl:copy-of select="$normalizedTDs" />
                </xsl:copy>
            </xsl:variable>
    
            <xsl:copy-of select="$newRow" />
    
            <xsl:apply-templates select="following-sibling::tr[1]" mode="rowspan">
                <xsl:with-param name="previousRow" select="$newRow" />
            </xsl:apply-templates>
        </xsl:template>
    
        <xsl:template match="td | th" mode="final">
            <xsl:choose>
                <xsl:when test="@rowspan">
                    <xsl:copy>
                        <xsl:copy-of select="@* except @rowspan" />
                        <xsl:copy-of select="node()" />
                    </xsl:copy>
                </xsl:when>
                <xsl:otherwise>
                    <xsl:copy-of select="." />
                </xsl:otherwise>
            </xsl:choose>
        </xsl:template>
    
    </xsl:stylesheet>
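For comparison, the bookkeeping described in the question (a per-column record of which columns are still covered by a rowspan from above) can also be carried from row to row as a sequence of integers instead of a string. The following is only a sketch of that idea, not part of the stylesheet above. It assumes that colspans have already been expanded, that the tr elements are direct children of table, and that the input is in no namespace; the function my:next-carry, the carry parameter, and the namespace URI are made up for illustration.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="urn:example:table"
    exclude-result-prefixes="xs my">

    <!-- Hypothetical helper. $carry holds, per column, how many rows
         (counting $row) are still covered by a span from above; 0 means
         the column is free. Returns the counts for the row after $row. -->
    <xsl:function name="my:next-carry" as="xs:integer*">
        <xsl:param name="carry" as="xs:integer*" />
        <xsl:param name="row" as="element(tr)" />
        <xsl:variable name="cells" select="$row/(th | td)" />
        <xsl:for-each select="1 to count($carry)">
            <xsl:variable name="col" select="." />
            <xsl:choose>
                <xsl:when test="$carry[$col] gt 0">
                    <xsl:sequence select="$carry[$col] - 1" />
                </xsl:when>
                <xsl:otherwise>
                    <!-- The cell in a free column is the n-th cell of the row,
                         n being the number of free columns so far. -->
                    <xsl:variable name="cell"
                        select="$cells[count($carry[position() le $col][. eq 0])]" />
                    <xsl:sequence select="if ($cell/@rowspan)
                                          then xs:integer($cell/@rowspan) - 1
                                          else 0" />
                </xsl:otherwise>
            </xsl:choose>
        </xsl:for-each>
    </xsl:function>

    <xsl:template match="table">
        <xsl:copy>
            <xsl:apply-templates select="tr[1]">
                <!-- Nothing is spanned above the first row: one 0 per column. -->
                <xsl:with-param name="carry"
                    select="for $c in tr[1]/(th | td) return 0" />
            </xsl:apply-templates>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="tr">
        <xsl:param name="carry" as="xs:integer*" />
        <xsl:variable name="cells" select="th | td" />
        <xsl:copy>
            <xsl:for-each select="1 to count($carry)">
                <xsl:variable name="col" select="." />
                <xsl:choose>
                    <xsl:when test="$carry[$col] gt 0">
                        <td class="empty" />
                    </xsl:when>
                    <xsl:otherwise>
                        <xsl:apply-templates
                            select="$cells[count($carry[position() le $col][. eq 0])]" />
                    </xsl:otherwise>
                </xsl:choose>
            </xsl:for-each>
        </xsl:copy>
        <!-- Process the rows one at a time, handing the updated counts on. -->
        <xsl:apply-templates select="following-sibling::tr[1]">
            <xsl:with-param name="carry" select="my:next-carry($carry, .)" />
        </xsl:apply-templates>
    </xsl:template>

    <!-- Real cells are copied without their rowspan attribute. -->
    <xsl:template match="th | td">
        <xsl:copy>
            <xsl:copy-of select="@* except @rowspan" />
            <xsl:copy-of select="node()" />
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>
```

On the rowspan example from the question, the counts go from (0, 0, 0) for the first row to (2, 1, 0) for the second and (1, 0, 0) for the third, so the second row receives two empty cells and the third row one. The design differs from the stylesheet above mainly in what is passed along: a sequence of integers rather than the previous normalized row.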