XML Document Transform StackOverflowError


I'm building an XML document from scratch. I wrote a class that inserts elements into the document and moves through the tree as it goes:

import java.io.StringWriter;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

public class XMLParser {

    private Document doc;
    private Node currentNode;

    public XMLParser(String path) {

        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();

            doc = builder.newDocument();
            currentNode = doc.createElement("Root");
            ((Element) currentNode).setAttribute("Path", path);
            doc.appendChild(currentNode);


        } catch (ParserConfigurationException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }

    }

    public XMLParser add(String name){
        currentNode = currentNode.appendChild(doc.createElement(name));

        return this;
    }

    public XMLParser attr(String name, String value){
        ((Element) currentNode).setAttribute(name, value);

        return this;
    }

    public XMLParser set(String value){
        currentNode = currentNode.appendChild(doc.createTextNode(value));

        return this;
    }

    public XMLParser up(){
        currentNode = currentNode.getParentNode();

        return this;
    }

    public String toXML(){
        final Transformer transformer;
        try {
            transformer = TransformerFactory.newInstance().newTransformer();
        } catch (final TransformerConfigurationException ex) {
            throw new IllegalStateException(ex);
        }
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "3");
        final StringWriter writer = new StringWriter();
        try {
            transformer.transform(
                new DOMSource(doc),
                new StreamResult(writer)
            );
        } catch (final TransformerException ex) {
            throw new IllegalArgumentException(ex);
        }
        return writer.toString();
    }


}

Now I can make calls like new XMLParser("").add("A").up().add("B").toXML(); to generate a String containing the XML.

This works well for small documents, but when the XML gets too large, I get a StackOverflowError at transformer.transform(...) in toXML():

Exception in thread "main" java.lang.StackOverflowError
at com.sun.org.apache.xml.internal.serializer.ToStream.characters(Unknown Source)
at com.sun.org.apache.xml.internal.serializer.ToUnknownStream.characters(Unknown Source)
at com.sun.org.apache.xml.internal.serializer.ToUnknownStream.characters(Unknown Source)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(Unknown Source)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(Unknown Source)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(Unknown Source)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(Unknown Source)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(Unknown Source)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(Unknown Source)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(Unknown Source)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(Unknown Source)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(Unknown Source)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(Unknown Source)
...[1000 lines in console buffer displayed]

Does anyone have an idea how to reduce the stack usage? I don't want to change the JVM settings, because the program should run on other computers as well.

Thanks in advance for your ideas or solutions!

Edit: Here is the recursive code. It is called with a java.io.File for a directory and traverses the folders inside it to build the XML tree from the directory's contents:

private void stepAhead(File f) throws IOException {
    if(f.isDirectory()){
        File[] ergebnis = f.listFiles();
        ArrayList<File> dateien = new ArrayList<File>();
        for (File temp : ergebnis) {
            if (temp.isDirectory()){
                //System.out.println("Checking Directory: " + temp.getCanonicalFile());
                xml.add("Dir").attr("Name", temp.getName());
                stepAhead(temp);
                xml.up();
            }
            else if(temp.isFile()){
                dateien.add(temp);
            }
            else{
                throw new RuntimeException("Neither a file nor a directory");
            }

        }
        for(File temp : dateien){
            BasicFileAttributes attributes = Files.readAttributes(temp.toPath(), BasicFileAttributes.class);

            xml.add("File");
            xml.add("Name").set(temp.getName()).up();
            xml.add("Size").set(""+attributes.size()).up();
            xml.add("DateCreated").set(formatFileTime(attributes.creationTime())).up();
            xml.add("DateLastModified").set(formatFileTime(attributes.lastModifiedTime())).up();
            xml.up();
        }
    }
}

xml is the XMLParser object; I just initialize it in the constructor.


Solution

  • The problem lies in XMLParser.set(String value) and in how you use it to build the directory document:

    XMLParser.set(String value) appends a text node and then makes that text node the current node.

    The builder code in stepAhead therefore goes down one level for the element and another level for the text node, but goes up only one level. After the call below, the current node is the Name element rather than its parent, so the next add(...) nests the new element inside Name:

    xml.add("Name").set(temp.getName()).up();
    

    The result is an incredibly deeply nested document when the code runs on directory trees full of files. The Transformer implementation traverses the document recursively (the repeated DOM2TO.parse frames in your stack trace), so a deep enough document causes the StackOverflowError.

    If you change XMLParser.set(String value) to

    public XMLParser set(String value){
        currentNode.appendChild(doc.createTextNode(value));
        return this;
    }
    

    everything works and the output looks right. (You should have seen the mangled output when printing the result for even a simple directory!)

    Anyway, XMLParser is a handy class that takes the pain out of DOM manipulation. Since it builds documents rather than parsing them, XMLBuilder might be a better name.
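
To make the nesting effect concrete, here is a minimal sketch that replays the add("Name").set(...).up() sequence against a plain DOM document, once with the original set() behavior and once with the corrected one, and measures how deep the tree gets. The class NestingDemo and its methods build and maxDepth are hypothetical helpers written for this illustration; they are not part of the question's code, and the depth is counted with the root element as level 1.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Document;
import org.w3c.dom.Node;

// Hypothetical demo class, not part of the question's code.
class NestingDemo {

    // Replays xml.add("Name").set(...).up() `count` times.
    // descendIntoText = true mimics the original set(), false the corrected one.
    static Document build(int count, boolean descendIntoText) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        Node current = doc.appendChild(doc.createElement("Root"));
        for (int i = 0; i < count; i++) {
            current = current.appendChild(doc.createElement("Name"));        // add("Name")
            Node text = current.appendChild(doc.createTextNode("file" + i)); // set(...)
            if (descendIntoText) {
                current = text; // the original set() moves into the text node
            }
            current = current.getParentNode();                               // up()
        }
        return doc;
    }

    // Depth of the deepest node, counting the root element as 1.
    // The walk is iterative, so the measurement itself cannot overflow the stack.
    static int maxDepth(Document doc) {
        Node root = doc.getDocumentElement();
        Node node = root;
        int depth = 1;
        int max = 1;
        while (true) {
            Node child = node.getFirstChild();
            if (child != null) {
                node = child;
                depth++;
                max = Math.max(max, depth);
                continue;
            }
            // No children: climb until a next sibling exists or we are back at the root.
            while (node != root && node.getNextSibling() == null) {
                node = node.getParentNode();
                depth--;
            }
            if (node == root) {
                break;
            }
            node = node.getNextSibling();
        }
        return max;
    }

    public static void main(String[] args) {
        System.out.println("original set():  " + maxDepth(build(1000, true)));
        System.out.println("corrected set(): " + maxDepth(build(1000, false)));
    }
}
```

With the original behavior the depth grows by one level for every value written (1000 entries give a depth of 1002 in this sketch), while with the corrected set() it stays at 3 no matter how many entries are added. That growth with the number of files, not the directory depth, is what pushes the recursive serializer past the stack limit.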