Delete namespace from xmlstarlet output


Background

I want to extract elements from the following XML content:

<ui:composition xmlns="http://www.w3.org/1999/xhtml"
                xmlns:h="http://java.sun.com/jsf/html"
                xmlns:f="http://java.sun.com/jsf/core"
                xmlns:ui="http://java.sun.com/jsf/facelets">
    <h:inputText id="id"/>
    ...
</ui:composition>

Extraction

All h:inputText elements can be selected using:

xmlstarlet sel -t -c "//h:inputText" filename.xml
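
Some xmlstarlet builds pick up the prefixes declared on the root element automatically; if yours reports the h prefix as undefined, it can be bound explicitly with -N. The sketch below wraps the command in a function (the name select_input_texts is hypothetical) and also prints one match per line. It does not remove the namespace declarations from the copied elements.

```shell
# Hypothetical wrapper around the command above.
# -N binds the h prefix explicitly, so the XPath does not depend on
# xmlstarlet discovering the prefix from the input document.
# -m iterates over the matches, -c copies each one, -n appends a newline.
select_input_texts() {
  xmlstarlet sel -N h="http://java.sun.com/jsf/html" \
    -t -m "//h:inputText" -c . -n "$1"
}

# Usage: select_input_texts filename.xml
```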

Problem

This produces the following namespace-infested output:

<h:inputText
    xmlns="http://www.w3.org/1999/xhtml"
    xmlns:h="http://java.sun.com/jsf/html"
    xmlns:f="http://java.sun.com/jsf/core"
    xmlns:ui="http://java.sun.com/jsf/facelets" id="id"/>

Question

How can the namespaces be suppressed from the output?

Ideas

Use regular expressions to post-process; however:

  • sed doesn't have a non-greedy match;
  • perl is too heavyweight (and would require a complex regex).
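
The missing non-greedy match can be sidestepped with a negated character class: [^"]* stops at the first closing quote. Below is a minimal sketch, assuming every attribute value is double-quoted and contains no embedded double quote; the function name strip_xmlns is hypothetical. It also removes the xmlns:h binding, so the h: prefix in the result is left undeclared and the output is no longer namespace-well-formed XML.

```shell
# Hypothetical helper: strip every xmlns="..." and xmlns:prefix="..."
# attribute from the text on stdin.
# [^"]* cannot run past the closing quote, so no non-greedy operator is needed.
strip_xmlns() {
  sed -E 's/[[:space:]]+xmlns(:[A-Za-z_][A-Za-z0-9_.-]*)?="[^"]*"//g'
}

# Example on a single-line element:
printf '%s\n' '<h:inputText xmlns="http://www.w3.org/1999/xhtml" xmlns:h="http://java.sun.com/jsf/html" id="id"/>' | strip_xmlns
# prints: <h:inputText id="id"/>
```

Because sed works line by line, a declaration that is split across lines would not be matched by this pattern.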

Pipe through xmllint or xmlstarlet for a second pass, but that requires a well-formed XML document.

Using xmllint poses its own set of namespace problems.

Produce a document consisting only of ui:composition and h:inputText elements:

<ui:composition
    xmlns="http://www.w3.org/1999/xhtml"
    xmlns:h="http://java.sun.com/jsf/html"
    xmlns:f="http://java.sun.com/jsf/core"
    xmlns:ui="http://java.sun.com/jsf/facelets">
  <h:inputText id="id"/>
  <h:inputText id="id"/>
</ui:composition>

This is tricky because the h:inputText elements can occur at any depth of the document.


Solution

  • You could use XSLT. If you want to output h:inputText as-is, you can't suppress the namespace declaration that binds the prefix h to the URI http://java.sun.com/jsf/html: without it, the prefixed element name would not be namespace-well-formed.

    XSLT 1.0

    Create input.xsl:

    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
      xmlns:h="http://java.sun.com/jsf/html">
      <xsl:output omit-xml-declaration="yes"/>
      <xsl:strip-space elements="*"/>
    
      <xsl:template match="/">
        <xsl:apply-templates select="//h:inputText"/>
      </xsl:template>
    
      <xsl:template match="h:inputText">
        <xsl:copy>
          <xsl:copy-of select="@*"/>
        </xsl:copy>
      </xsl:template>
    
    </xsl:stylesheet>
    

    xmlstarlet command

    xmlstarlet tr input.xsl filename.xml
    

    Output

    <h:inputText xmlns:h="http://java.sun.com/jsf/html" id="id"/>
    

    Alternatively, you could output inputText in no namespace:

    XSLT 1.0

    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
      xmlns:h="http://java.sun.com/jsf/html" exclude-result-prefixes="h">
      <xsl:output omit-xml-declaration="yes"/>
      <xsl:strip-space elements="*"/>
    
      <xsl:template match="/">
        <xsl:apply-templates select="//h:inputText"/>
      </xsl:template>
    
      <xsl:template match="h:inputText">
        <inputText>
          <xsl:copy-of select="@*"/>
        </inputText>
      </xsl:template>
    
    </xsl:stylesheet>
    

    Output

    Using the same command line as above:

    <inputText id="id"/>
    

    Note: You might need to add <xsl:text>&#xA;</xsl:text> after </xsl:copy> (or </inputText> in the second example) to explicitly add a newline. Otherwise xmlstarlet might output all the elements on a single line. (It did for me using xmlstarlet 1.6.1, and indent="yes" on xsl:output didn't help.)
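
As a sketch of that newline fix, the snippet below writes the second stylesheet to a temporary file with the extra xsl:text line feed in place, then wraps the xmlstarlet tr call in a function. The variable xsl_file and the function name extract_inputs are hypothetical, and the temporary file is not cleaned up.

```shell
# Write the no-namespace stylesheet, plus the explicit newline, to a temp file.
xsl_file=$(mktemp)
cat > "$xsl_file" <<'EOF'
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:h="http://java.sun.com/jsf/html" exclude-result-prefixes="h">
  <xsl:output omit-xml-declaration="yes"/>
  <xsl:strip-space elements="*"/>

  <xsl:template match="/">
    <xsl:apply-templates select="//h:inputText"/>
  </xsl:template>

  <xsl:template match="h:inputText">
    <inputText>
      <xsl:copy-of select="@*"/>
    </inputText>
    <!-- Explicit line feed after each emitted element -->
    <xsl:text>&#xA;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
EOF

# Apply the stylesheet to the file given as the first argument.
extract_inputs() {
  xmlstarlet tr "$xsl_file" "$1"
}

# Usage: extract_inputs filename.xml
```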

    JSF Output

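    Since JSF is involved, consider wrapping the extracted elements in a single h:html root, which yields a well-formed document and keeps the h: prefix: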

    <xsl:stylesheet version="1.0"
                    xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
                    xmlns:h="http://java.sun.com/jsf/html"
                    xmlns:f="http://java.sun.com/jsf/core"
                    xmlns:c="http://java.sun.com/jsp/jstl/core"
                    xmlns:ui="http://java.sun.com/jsf/facelets"
                    xmlns:a4j="http://richfaces.org/a4j"
                    exclude-result-prefixes="h f c ui a4j">
        <xsl:output method="xml" omit-xml-declaration="yes" />
        <xsl:strip-space elements="*"/>
    
        <xsl:template match="/">
            <h:html>
                <xsl:apply-templates select="//h:inputText"/>
            </h:html>
            <xsl:text>&#xA;</xsl:text>
        </xsl:template>
    
        <xsl:template match="h:inputText">
            <xsl:text>&#xA;</xsl:text>
            <h:inputText>
                <xsl:copy-of select="@*"/>
            </h:inputText>
            <xsl:text>&#xA;</xsl:text>
        </xsl:template>
    </xsl:stylesheet>