
Remove special characters from XML via XSLT only for specific tags


I am having an issue with special characters in my XML. Basically, I am splitting an XML document into multiple XML files using the Xalan processor.

When splitting the documents, I use the value of each name tag as the name of the generated file. The problem is that the name contains characters that aren't recognized by the XML processor, like ™ (TM) and ® (R). I want to remove those characters ONLY when naming the files.

<xsl:template match="products">
    <redirect:write select="concat('..\\xml\\product\\en\\',translate(string(name),'&lt;/&gt; ',''),'.xml')">

The above is the XSL code I have written to split the XML into multiple XML files. As you can see, I am using the translate function to replace '/', '<' and '>' with '' in the name. I was hoping I could do the same with ™ (TM) and ® (R), but it doesn't seem to work. Please advise me on how I could do that.

Thanks for your help in advance.


Solution

  • I don't have Xalan, but with 8 other XSLT processors this transformation:

    <xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
     <xsl:output method="text"/>
    
     <xsl:template match="text()">
       <xsl:value-of select="translate(., '&lt;/&gt;™®', '')"/>
       ===================
       <xsl:value-of select="translate(., '&lt;/&gt;&#x2122;&#xAE;', '')"/>
     </xsl:template>
    </xsl:stylesheet>
    

    when applied on this XML document:

    <t>XXX™ My Trademark®</t>
    

    produces the wanted result:

    XXX My Trademark
       ===================
       XXX My Trademark
    

    I suggest that you try one of the two expressions above. The second, which uses numeric character references instead of the literal characters, is the more likely to work.
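
Applied to the template from the question, the second expression might look like the sketch below. It is untested with Xalan, so treat it as a starting point rather than a confirmed fix. The `redirect` namespace declaration, the `unsafe` variable name, and the `xsl:copy-of` body are assumptions, since the question does not show those parts of the stylesheet. Numeric character references are used because they do not depend on the encoding the stylesheet file is saved in, which is one plausible reason the literal ™ and ® failed.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:redirect="http://xml.apache.org/xalan/redirect"
    extension-element-prefixes="redirect">

  <!-- Characters to strip from file names: '<', '/', '>', space,
       U+2122 (trade mark sign) and U+00AE (registered sign).
       The variable name "unsafe" is hypothetical. -->
  <xsl:variable name="unsafe" select="'&lt;/&gt; &#x2122;&#xAE;'"/>

  <xsl:template match="products">
    <!-- Only the file name is cleaned; the content of name is left untouched. -->
    <redirect:write select="concat('..\\xml\\product\\en\\',
                                   translate(string(name), $unsafe, ''),
                                   '.xml')">
      <!-- Assumption: each output file receives a copy of the current element. -->
      <xsl:copy-of select="."/>
    </redirect:write>
  </xsl:template>

</xsl:stylesheet>
```

Because the third argument of `translate` is empty, every character listed in the second argument is deleted from the name. Further characters that are invalid in file names can be appended to the variable in the same way.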