The StAX parser converts the double quotes around attribute values to single quotes in the data model used by XMLEventReader. This is fine, but if I want to print the XML back out, perhaps selecting only a fragment of the original XML, the output will not be the same as the input.
Input file:
<root>
<mySubTrees>
<mySubTree>
<a property="target">
<aa>123</aa>
</a>
<b>456</b>
<c>789</c>
</mySubTree>
</mySubTrees>
</root>
Code:
@Test
public void test_getXmlFragment() throws Exception {
    byte[] fileContent = getXMLBytes();
    // StringBuilder avoids repeated String concatenation inside the loop
    StringBuilder xmlFragment = new StringBuilder();
    XMLInputFactory factory = XMLInputFactory.newInstance();
    XMLEventReader eventReader = factory.createXMLEventReader(new ByteArrayInputStream(fileContent));
    while (eventReader.hasNext()) {
        XMLEvent event = eventReader.nextEvent();
        // append(Object) uses event.toString(), exactly as the original += did
        xmlFragment.append(event);
    }
    System.out.println(xmlFragment);
}

private byte[] getXMLBytes() throws IOException {
    // try-with-resources closes the stream even if reading fails
    try (InputStream inputStream = this.getClass().getResource(PREFIX_XML_FILES + "/sss.xml").openStream()) {
        // readAllBytes() needs Java 9+; available() plus a single read() may return only part of the file
        return inputStream.readAllBytes();
    }
}
Output:
<?xml version="null" encoding='UTF-8' standalone='no'?>
<root>
<mySubTrees>
<mySubTree>
<a property='target'>
<aa>123</aa>
</a>
<b>456</b>
<c>789</c>
</mySubTree>
</mySubTrees>
</root>
Desired Output:
<?xml version="null" encoding="UTF-8" standalone="no"?>
<root>
<mySubTrees>
<mySubTree>
<a property="target">
<aa>123</aa>
</a>
<b>456</b>
<c>789</c>
</mySubTree>
</mySubTrees>
</root>
Is there any way to fine-tune this?
No. In XML, an attribute value wrapped in single quotes is equivalent to one wrapped in double quotes, and it is unreasonable to require that the two be treated differently.
StAX's job is not to preserve the syntax of the XML file it reads. StAX is a parser; its job is to relay the data model expressed in that XML, and it does this job perfectly.
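To illustrate the point, here is a minimal sketch that parses the same element written with each quote style and compares the attribute values the parser reports. The class name QuoteStyleDemo and the helper readProperty are made up for this example and are not part of the question's code.

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.Attribute;
import javax.xml.stream.events.XMLEvent;

public class QuoteStyleDemo {

    // Hypothetical helper: returns the value of the "property" attribute
    // on the first start element of the document, or null if it is absent.
    static String readProperty(String xml) {
        try {
            XMLEventReader reader = XMLInputFactory.newInstance()
                    .createXMLEventReader(new StringReader(xml));
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                if (event.isStartElement()) {
                    Attribute attr = event.asStartElement()
                            .getAttributeByName(new QName("property"));
                    return attr == null ? null : attr.getValue();
                }
            }
            return null;
        } catch (XMLStreamException e) {
            // Rethrow unchecked to keep the example short
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String doubleQuoted = "<a property=\"target\"/>";
        String singleQuoted = "<a property='target'/>";
        // Compare what the parser reports for each quote style
        System.out.println(readProperty(doubleQuoted).equals(readProperty(singleQuoted)));
    }
}
```

Both inputs yield the value target, so once parsing is done nothing is left that records which quote character the file used.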
A requirement like yours would likely force you to write your own XML library, because standard parsers do not treat the quote style as significant. You shouldn't have this requirement in the first place.
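If the real goal is just to print a fragment, one option is to stop relying on XMLEvent.toString(), whose format is implementation-specific, and copy the events to an XMLEventWriter instead. Below is a rough sketch under that assumption; FragmentCopier and copyFragment are hypothetical names, and the method simply copies the first element whose local name matches.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

public class FragmentCopier {

    // Hypothetical helper: serializes the first element named elementName
    // (with all of its content) and returns it as a String.
    static String copyFragment(String xml, String elementName) {
        try {
            XMLEventReader reader = XMLInputFactory.newInstance()
                    .createXMLEventReader(new StringReader(xml));
            StringWriter out = new StringWriter();
            XMLEventWriter writer = XMLOutputFactory.newInstance()
                    .createXMLEventWriter(out);
            int depth = 0; // 0 means we are still outside the fragment
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                if (depth == 0) {
                    // Skip everything until the requested element starts
                    if (event.isStartElement() && event.asStartElement()
                            .getName().getLocalPart().equals(elementName)) {
                        depth = 1;
                        writer.add(event);
                    }
                    continue;
                }
                writer.add(event);
                if (event.isStartElement()) {
                    depth++;
                } else if (event.isEndElement() && --depth == 0) {
                    break; // the end tag of the fragment was just written
                }
            }
            writer.flush();
            writer.close();
            return out.toString();
        } catch (XMLStreamException e) {
            // Rethrow unchecked to keep the example short
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<root><a property='target'><aa>123</aa></a><b>456</b></root>";
        System.out.println(copyFragment(xml, "a"));
    }
}
```

The writer, not the parser, decides how attribute values are quoted. As far as I know the JDK's bundled implementation uses double quotes, but the StAX API does not promise that, so treat it as an implementation detail rather than something to depend on.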