XSLT transformation: remove tags with specific names


I asked a related question earlier and received a working answer: https://stackoverflow.com/a/78638030/23979545

Now I need to add one more condition to that transformation.

Initial xml file:

<root>
    <_D>urn:oasis:names:specification:ubl:schema:xsd:Invoice-2</_D>
    <_A>urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2</_A>
    <_B>urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2</_B>
    <Invoice>
        <ID>
            <_>INV12345</_>
        </ID>
        <InvoiceTypeCode>
            <_>01</_>
            <listVersionID>1.0</listVersionID>
        </InvoiceTypeCode>
        <InvoicePeriod>
            <StartDate>
                <_>2017-11-26</_>
            </StartDate>
            <EndDate>
                <_>2017-11-30</_>
            </EndDate>
        </InvoicePeriod>
        <AccountingSupplierParty>
            <AdditionalAccountID>
                <_>id-1234</_>
                <schemeAgencyName>name1</schemeAgencyName>
            </AdditionalAccountID>
        </AccountingSupplierParty>
    </Invoice>
</root>

Final xml file:

<Invoice>
    <ID>INV12345</ID>
    <InvoiceTypeCode listVersionID="1.0">01</InvoiceTypeCode>
    <InvoicePeriod>
        <StartDate>2017-11-26</StartDate>
        <EndDate>2017-11-30</EndDate>
    </InvoicePeriod>
    <AccountingSupplierParty>
        <AdditionalAccountID schemeAgencyName="name1">id-1234</AdditionalAccountID>
    </AccountingSupplierParty>
</Invoice>

The transformation needs to:

  • remove all elements whose names start with "_" and are longer than one character (like <_D>, <_A>, <_B>); this is the condition I can't get right;
  • remove the root element (done);
  • remove all <_> elements while moving their values to their parent elements (done);
  • convert all other non-<_> leaf elements into attributes of their parent elements (done).

This is the solution provided by @y.arazim for the last three steps:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:output method="xml" indent="yes"/>
    <xsl:strip-space elements="*"/>

    <xsl:template match="*">
        <xsl:copy>
            <xsl:copy-of select="@*"/>

            <!-- convert leaf child elements (except _) to attributes -->
            <xsl:for-each select="*[not(* or self::_)]">
                <xsl:attribute name="{name()}">
                    <xsl:value-of select="."/>
                </xsl:attribute>
            </xsl:for-each>

            <!-- process other child elements and text nodes -->
            <xsl:apply-templates select="*[*] | _ | text()"/>
        </xsl:copy>
    </xsl:template>

    <!-- remove root and _ elements -->
    <xsl:template match="/* | _">
        <xsl:apply-templates/>
    </xsl:template>


</xsl:stylesheet>

Solution

  • remove all tags starting with "_" and length > 1

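    Match them with an empty template, which produces no output for those elements or their content. Because its pattern has a predicate, its default priority (0.5) is higher than that of the generic match="*" template (-0.5), so it wins whenever both match: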

    <xsl:template match="*[starts-with(name(), '_') and string-length(name()) > 1]"/>
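
For context, here is a sketch of how the empty template might sit in the full stylesheet from the question. It is not a definitive version. The extra `starts-with` test inside the `for-each` is an assumption that goes beyond the original answer: it only matters if an underscore-prefixed leaf element such as `<_D>` can also appear below the root, where it would otherwise be turned into an attribute. In the sample input those elements occur only as children of the root, so the single empty template is already enough there.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:output method="xml" indent="yes"/>
    <xsl:strip-space elements="*"/>

    <xsl:template match="*">
        <xsl:copy>
            <xsl:copy-of select="@*"/>

            <!-- convert leaf child elements to attributes; names starting
                 with "_" are skipped (this covers <_> and, as an assumed
                 extra guard, nested elements such as <_D>) -->
            <xsl:for-each select="*[not(*) and not(starts-with(name(), '_'))]">
                <xsl:attribute name="{name()}">
                    <xsl:value-of select="."/>
                </xsl:attribute>
            </xsl:for-each>

            <!-- process other child elements and text nodes -->
            <xsl:apply-templates select="*[*] | _ | text()"/>
        </xsl:copy>
    </xsl:template>

    <!-- remove the root and _ elements, but keep processing their content -->
    <xsl:template match="/* | _">
        <xsl:apply-templates/>
    </xsl:template>

    <!-- drop elements such as _D, _A, _B together with their content -->
    <xsl:template match="*[starts-with(name(), '_') and string-length(name()) > 1]"/>

</xsl:stylesheet>
```

With any XSLT 1.0 processor, for example `xsltproc transform.xsl input.xml` (both file names are placeholders), this should produce the `Invoice` document from the question without the `_D`, `_A`, and `_B` elements. The bare `<_>` element is not affected by the empty template, because its name is exactly one character long and so fails the `string-length(name()) > 1` test.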