
Simple modification of XML with StAX


I am trying to modify an existing XML file. As far as I know, it's not possible to do this in place, so my idea is to read the file as a stream, apply the changes while writing a new file, and then replace the old file with the new one.
I only need to apply simple changes, so I decided to go with StAX, as it's well suited to large amounts of data and simple processing.

Existing XML file:

<?xml version='1.0' encoding='UTF-8'?>
<Company>
    <Employee>
        <FirstName>Tanmay</FirstName>
        <LastName>Patil</LastName>
        <ContactNo>1234567890</ContactNo>
        <Address>
            <City>Bangalore</City>
            <State>Karnataka</State>
            <Zip>560212</Zip>
        </Address>
    </Employee>
</Company>

Desired output:

<?xml version='1.0' encoding='UTF-8'?>
<Company>
    <Employee>
        <FirstName>Tanmay</FirstName>
        <LastName>Patil</LastName>
        <ContactNo>1234567890</ContactNo>
        <Address>
            <City>Bangalore</City>
            <State>Karnataka</State>
            <NewElem>Some value</NewElem> <!-- Replacing all ZIP-elements -->
        </Address>
    </Employee>
</Company>

This code simply duplicates an XML file (source):

public static void writeAll(XMLStreamReader xmlr, XMLStreamWriter writer)
        throws XMLStreamException {
    while (xmlr.hasNext()) {
        write(xmlr, writer);
        xmlr.next();
    }
    write(xmlr, writer); // write the last element
    writer.flush();
}

public static void write(XMLStreamReader xmlr, XMLStreamWriter writer) throws XMLStreamException {
    switch (xmlr.getEventType()) {
        case XMLEvent.START_ELEMENT:
            final String localName = xmlr.getLocalName();
            final String namespaceURI = xmlr.getNamespaceURI();
            if (namespaceURI != null && namespaceURI.length() > 0) {
                final String prefix = xmlr.getPrefix();
                if (prefix != null) {
                    writer.writeStartElement(prefix, localName, namespaceURI);
                } else {
                    writer.writeStartElement(namespaceURI, localName);
                }
            } else {
                writer.writeStartElement(localName);
            }

            for (int i = 0, len = xmlr.getNamespaceCount(); i < len; i++) {
                writer.writeNamespace(xmlr.getNamespacePrefix(i), xmlr.getNamespaceURI(i));
            }

            for (int i = 0, len = xmlr.getAttributeCount(); i < len; i++) {
                String attUri = xmlr.getAttributeNamespace(i);
                if (attUri != null) {
                    writer.writeAttribute(attUri, xmlr.getAttributeLocalName(i), xmlr.getAttributeValue(i));
                } else {
                    writer.writeAttribute(xmlr.getAttributeLocalName(i), xmlr.getAttributeValue(i));
                }
            }
            break;
        case XMLEvent.END_ELEMENT:
            writer.writeEndElement();
            break;
        case XMLEvent.SPACE:
        case XMLEvent.CHARACTERS:
            writer.writeCharacters(xmlr.getTextCharacters(), xmlr.getTextStart(), xmlr.getTextLength());
            break;
        case XMLEvent.PROCESSING_INSTRUCTION:
            writer.writeProcessingInstruction(xmlr.getPITarget(), xmlr.getPIData());
            break;
        case XMLEvent.CDATA:
            writer.writeCData(xmlr.getText());
            break;
        case XMLEvent.COMMENT:
            writer.writeComment(xmlr.getText());
            break;
        case XMLEvent.ENTITY_REFERENCE:
            writer.writeEntityRef(xmlr.getLocalName());
            break;
        case XMLEvent.START_DOCUMENT:
            String encoding = xmlr.getCharacterEncodingScheme();
            String version = xmlr.getVersion();
            if (encoding != null && version != null) {
                writer.writeStartDocument(encoding, version);
            } else if (version != null) {
                writer.writeStartDocument(xmlr.getVersion());
            }
            break;
        case XMLEvent.END_DOCUMENT:
            writer.writeEndDocument();
            break;
        case XMLEvent.DTD:
            writer.writeDTD(xmlr.getText());
            break;
    }
}

It works, but I'm not sure the complexity of the write() method is justified. Are all those switch cases really necessary?


I also had problems replacing the Zip elements with this loop:

while (reader.hasNext()) {
    write(reader, writer);
    reader.next();
    if (reader.getEventType() == XMLStreamReader.START_ELEMENT) {
        String elementName = reader.getLocalName();
        if (elementName.contains("ZIP")) {
                writer.writeStartElement("newElem");
                writer.writeAttribute("atr", "val");
                writer.writeEndElement();
        }
    }
}

What is the most efficient way to replace some nodes in an XML file?
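
For comparison, a StAX-only replacement can be sketched with the event API (XMLEventReader and XMLEventWriter). XMLEventWriter.add(XMLEvent) copies any event unchanged, so no switch is needed and the loop only has to special-case the element being replaced. This is a minimal sketch, not a tuned implementation: the class name ZipReplacer and the method replaceZip are hypothetical, and it assumes the element is matched by its local name Zip, as in the sample file.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.stream.XMLEventFactory;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical helper class, for illustration only.
public class ZipReplacer {

    /** Copies all events from reader to writer, replacing each Zip element with NewElem. */
    public static void replaceZip(XMLEventReader reader, XMLEventWriter writer)
            throws XMLStreamException {
        XMLEventFactory events = XMLEventFactory.newInstance();
        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            if (event.isStartElement()
                    && "Zip".equals(event.asStartElement().getName().getLocalPart())) {
                // Consume the old element, including nested content, up to its end tag.
                int depth = 1;
                while (depth > 0) {
                    XMLEvent skipped = reader.nextEvent();
                    if (skipped.isStartElement()) {
                        depth++;
                    } else if (skipped.isEndElement()) {
                        depth--;
                    }
                }
                // Write the replacement element (no prefix, no namespace).
                writer.add(events.createStartElement("", "", "NewElem"));
                writer.add(events.createCharacters("Some value"));
                writer.add(events.createEndElement("", "", "NewElem"));
            } else {
                // Every other event is copied as-is.
                writer.add(event);
            }
        }
        writer.flush();
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<Company><Employee><Address><City>Bangalore</City>"
                + "<Zip>560212</Zip></Address></Employee></Company>";
        StringWriter out = new StringWriter();
        XMLEventReader reader =
                XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
        XMLEventWriter writer =
                XMLOutputFactory.newInstance().createXMLEventWriter(out);
        replaceZip(reader, writer);
        writer.close();
        reader.close();
        System.out.println(out);
    }
}
```

For real files, the reader and writer would presumably be created from a FileInputStream on the original and a FileOutputStream on a temporary file, which then replaces the original once the copy has finished.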


Solution

  • XSLT has the so-called identity transform pattern.

    Useful link: XSL Identity Transforms

    The XSLT below will copy the entire input XML as-is with the exception of the Zip element. The moment the Zip element is found, it will be replaced with the new desired tag.

    All you have to do is call the XSLT transformation from your Java code.

    Input XML

    <?xml version="1.0" encoding="UTF-8"?>
    <Company>
        <Employee>
            <FirstName>Tanmay</FirstName>
            <LastName>Patil</LastName>
            <ContactNo>1234567890</ContactNo>
            <Address>
                <City>Bangalore</City>
                <State>Karnataka</State>
                <Zip>560212</Zip>
            </Address>
        </Employee>
    </Company>
    

    XSLT

    <?xml version="1.0"?>
    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
        <xsl:output indent="yes" method="xml" encoding="utf-8"/>
    
        <!-- IdentityTransform -->
        <xsl:template match="/ | @* | node()">
            <xsl:copy>
                <xsl:apply-templates select="@* | node()"/>
            </xsl:copy>
        </xsl:template>
    
        <xsl:template match="Zip">
            <NewElem>Some value</NewElem>
        </xsl:template>
    
    </xsl:stylesheet>
    

    Output XML

    <?xml version='1.0' encoding='utf-8' ?>
    <Company>
      <Employee>
        <FirstName>Tanmay</FirstName>
        <LastName>Patil</LastName>
        <ContactNo>1234567890</ContactNo>
        <Address>
          <City>Bangalore</City>
          <State>Karnataka</State>
          <NewElem>Some value</NewElem>
        </Address>
      </Employee>
    </Company>
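
Calling the transformation from Java can be sketched with the standard javax.xml.transform API. The class name XsltRunner and the method transform are hypothetical, and the XML and stylesheet are passed as strings here only to keep the example self-contained; treat it as an illustration under those assumptions rather than a finished utility.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class, for illustration only.
public class XsltRunner {

    /** Applies the given stylesheet to the given XML and returns the result as a string. */
    public static String transform(String xml, String xslt) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        // Identity transform plus one override for Zip, as in the stylesheet above.
        String xslt =
                "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
              + "<xsl:template match='/ | @* | node()'>"
              + "<xsl:copy><xsl:apply-templates select='@* | node()'/></xsl:copy>"
              + "</xsl:template>"
              + "<xsl:template match='Zip'><NewElem>Some value</NewElem></xsl:template>"
              + "</xsl:stylesheet>";
        String xml = "<Company><Employee><Address><City>Bangalore</City>"
                + "<Zip>560212</Zip></Address></Employee></Company>";
        System.out.println(transform(xml, xslt));
    }
}
```

StreamSource and StreamResult also accept java.io.File arguments, so the same call should work with the input file, the stylesheet file, and a temporary output file that replaces the original afterwards.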