I have a value in my database that contains a non-breaking space. A legacy service reads this string from the database and builds an XML document from it. The problem is that the XML returned for this message cannot be parsed. When I open it in Notepad++ I see the character xA0 in place of the non-breaking space, and the XML becomes parseable once I remove that character. I also have older revisions of this XML file, produced by the same service, which contain the characters "Â " in place of the non-breaking space. I recently changed the Tomcat server that the service runs on, and something seems to have gone wrong as a result. I found this post, according to which my XML is being encoded as ISO-8859-1, but the code I use to convert the XML to a string does not use ISO-8859-1. Below is my code:
    private String nodeToString(Node node) {
        StringWriter sw = new StringWriter();
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "no");
            t.transform(new DOMSource(node), new StreamResult(sw));
        } catch (TransformerException te) {
            LOG.error("Exception during String to XML transformation ", te);
        }
        return sw.toString();
    }
I want to know why my XML is un-parseable and why there is a "Â " in the older revisions of the XML file.
Here is an image of the problematic character in Notepad++: [image in Notepad++]
Also, when I open my XML in Notepad and try to save it, the encoding type shown is ANSI. When I change it to UTF-8 and then save it, the XML becomes parseable.
New info: enforcing UTF-8 with transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8"); did not work. I am still getting the xA0 in my XML.
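One likely reason the property made no difference, sketched under assumptions: nodeToString hands the transformer a StringWriter, so the result is a Java String (characters, not bytes). In that case OutputKeys.ENCODING can only influence the XML declaration and character escaping; the actual bytes are chosen later by whatever writes the string to disk. The names below (EncodingPropertyDemo, sampleDocument, toXmlString, toXmlBytes, textOf) are hypothetical and only illustrate the difference between a Writer target and an OutputStream target.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class; not part of the original service.
class EncodingPropertyDemo {

    // Builds a tiny document whose text contains a non-breaking space (U+00A0).
    static Document sampleDocument() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = doc.createElement("msg");
            root.setTextContent("a\u00A0b");
            doc.appendChild(root);
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    private static Transformer utf8Transformer() throws Exception {
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        return t;
    }

    // Writer target: the transformer produces characters only, so no charset
    // is applied to bytes here. The caller decides the bytes later.
    static String toXmlString(Document doc) {
        try {
            StringWriter sw = new StringWriter();
            utf8Transformer().transform(new DOMSource(doc), new StreamResult(sw));
            return sw.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // OutputStream target: the transformer itself encodes the bytes using
    // the ENCODING output property.
    static byte[] toXmlBytes(Document doc) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            utf8Transformer().transform(new DOMSource(doc), new StreamResult(bos));
            return bos.toByteArray();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Parses raw XML bytes and returns the text content of the root element.
    static String textOf(byte[] xmlBytes) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xmlBytes))
                    .getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = sampleDocument();
        System.out.println(toXmlString(doc));
        System.out.println(textOf(toXmlBytes(doc)).equals("a\u00A0b"));
    }
}
```

If the service has to go through a String, the charset must be supplied at the point where that String is finally turned into bytes.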
The issue was that my Java setup was saving the file in the ANSI encoding rather than UTF-8, presumably because the file was being written with the JVM's platform default charset. I noticed this when I opened the file in Notepad and tried to save it. The older files were in UTF-8 format. So all I did was specify UTF-8 encoding explicitly when writing the file:
    // Write the XML with an explicit charset instead of relying on the platform default.
    try (Writer out = new BufferedWriter(new OutputStreamWriter(
            new FileOutputStream(fileName.trim()), StandardCharsets.UTF_8))) {
        out.write(data);
    }
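To see why the same character shows up as xA0 in the new files and as "Â " in the old ones, here is a minimal sketch of what happens to a non-breaking space (U+00A0) under the two encodings. It assumes the "ANSI" code page agrees with ISO-8859-1 for the bytes involved (this holds for Windows-1252); NbspBytesDemo and its methods are hypothetical names used only for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the original service.
class NbspBytesDemo {

    // Formats bytes as upper-case hex pairs separated by spaces.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // UTF-8 bytes viewed through a single-byte charset: every byte becomes
    // its own character, which is how the old files look in an ANSI viewer.
    static String utf8ReadAsLatin1(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    // Single-byte output decoded as UTF-8: a lone 0xA0 is malformed, so the
    // decoder substitutes U+FFFD. A strict XML parser rejects it instead.
    static String latin1ReadAsUtf8(String s) {
        return new String(s.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String nbsp = "\u00A0";
        System.out.println(hex(nbsp.getBytes(StandardCharsets.UTF_8)));      // C2 A0
        System.out.println(hex(nbsp.getBytes(StandardCharsets.ISO_8859_1))); // A0
        // Prints true: the misread text is U+00C2 ("Â") followed by U+00A0.
        System.out.println(utf8ReadAsLatin1(nbsp).equals("\u00C2\u00A0"));
        // Prints true: the lone byte decodes to the replacement character.
        System.out.println(latin1ReadAsUtf8(nbsp).equals("\uFFFD"));
    }
}
```

So the old UTF-8 files contain the valid two-byte sequence C2 A0, which an ANSI viewer renders as "Â " followed by the space, while the new files contain the single byte A0, which is not valid UTF-8 on its own.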