Summary: I want to save an org.w3c.dom.Document to a file with nice indentation (pretty-print it). The code below, which uses a Transformer, does the job in some cases but not in all (see the example). Can you help me fix this?
I have an org.w3c.dom.Document (not an org.jdom.Document) and want to format it nicely and print it to a file. How can I do that? I tried this, but it doesn't work if there are additional newlines in the document:
import java.io.ByteArrayInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
public class Main {
    public static void main(String[] args) {
        try {
            String input = "<asdf>\n\n<a>text</a></asdf>";
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(input.getBytes()));
            System.out.println("-- input -------------------\n" + input + "\n----------------------------");
            System.out.println("-- output ------------------");
            prettify(doc);
            System.out.println("----------------------------");
        } catch (Exception e) {
            // Report failures instead of silently swallowing them
            e.printStackTrace();
        }
    }

    public static void prettify(Document doc) {
        try {
            TransformerFactory transformerFactory = TransformerFactory.newInstance();
            Transformer transformer = transformerFactory.newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "no");
            transformer.setOutputProperty(OutputKeys.METHOD, "xml");
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
            transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
            transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");
            transformer.transform(new DOMSource(doc), new StreamResult(System.out));
        } catch (Exception e) {
            // Report failures instead of silently swallowing them
            e.printStackTrace();
        }
    }
}
I have directed the output to System.out so that you can run it easily wherever you want (for instance on Ideone.com). As you can see, the output is not pretty. If I remove the \n\n from the input string, everything is fine. The document usually doesn't come from a string, though, but from a file, and it gets modified heavily before I want to prettify it.
The Transformer seems to be the right way, but I am missing something. Can you tell me what I am doing wrong?
SSCCE output:
-- input -------------------
<asdf>

<a>text</a></asdf>
----------------------------
-- output ------------------
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<asdf>

<a>text</a>
</asdf>
----------------------------
Expected output:
-- input -------------------
<asdf>

<a>text</a></asdf>
----------------------------
-- output ------------------
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<asdf>
  <a>text</a>
</asdf>
----------------------------
Try this. It needs org.apache.xml.serialize.XMLSerializer and org.apache.xml.serialize.OutputFormat (both from Apache Xerces):
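import java.io.StringWriter;
import java.io.Writer;
import org.apache.xml.serialize.OutputFormat;
import org.apache.xml.serialize.XMLSerializer;

// document is an instance of org.w3c.dom.Document
OutputFormat format = new OutputFormat(document);
format.setLineWidth(65);
format.setIndenting(true);
format.setIndent(2);
Writer out = new StringWriter();
XMLSerializer serializer = new XMLSerializer(out, format);
serializer.serialize(document); // may throw IOException
String formattedXML = out.toString();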
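If adding Xerces is not an option, a JDK-only alternative may be to keep the Transformer from the question and first remove the whitespace-only text nodes (the parsed "\n\n" is one of them), since those appear to be what stops the indenter from laying out the children itself. The following is a minimal sketch, not a definitive implementation: the class name PrettyPrint and the helper stripWhitespaceNodes are hypothetical, and it assumes whitespace-only text nodes carry no meaning in your document (no mixed content that relies on them, no xml:space="preserve"). The indent-amount property is specific to the JDK's built-in Xalan-derived transformer, so other implementations may ignore it.

```java
import java.io.ByteArrayInputStream;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class PrettyPrint {

    // Hypothetical helper: removes every text node that consists only of
    // whitespace, so the Transformer can insert its own line breaks.
    static void stripWhitespaceNodes(Document doc) throws Exception {
        NodeList blanks = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate("//text()[normalize-space(.) = '']", doc, XPathConstants.NODESET);
        for (int i = 0; i < blanks.getLength(); i++) {
            Node blank = blanks.item(i);
            blank.getParentNode().removeChild(blank);
        }
    }

    // Strips whitespace-only nodes (this modifies the document), then
    // serializes it with indentation enabled and returns the result.
    static String prettify(Document doc) throws Exception {
        stripWhitespaceNodes(doc);
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        // Assumption: honored by the JDK's default transformer only
        transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Same input as in the question, including the extra newlines
        String input = "<asdf>\n\n<a>text</a></asdf>";
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(input.getBytes(StandardCharsets.UTF_8)));
        System.out.print(prettify(doc));
    }
}
```

Note that prettify changes the Document it is given. If the original whitespace must survive, it would be safer to run it on a copy, for example one obtained with doc.cloneNode(true).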