I'm trying to write a file containing some German characters to disk and read it back using the
Windows-1252 encoding. I don't understand why, but my output looks like this:
<title>W�hrend und im Anschluss an die Exkursion stehen Ihnen die Ansprechpartner f�r O-T�ne</title>
<p>Die Themen im �berblick</p>
Any thoughts? Here is my code. You'll need spring-core and commons-io to run it.
private static void write(String fileName, Charset charset) throws IOException {
    String html = "<html xmlns=\"http://www.w3.org/1999/xhtml\">" +
            "<head>" +
            "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=windows-1252\">" +
            "<title>Während und im Anschluss an die Exkursion stehen Ihnen die Ansprechpartner für O-Töne</title>" +
            "</head>" +
            "<body>" +
            "<p>Die Themen im Überblick</p>" +
            "</body>" +
            "</html>";
    byte[] bytes = html.getBytes(charset);
    FileOutputStream outputStream = new FileOutputStream(fileName);
    OutputStreamWriter writer = new OutputStreamWriter(outputStream, charset);
    IOUtils.write(bytes, writer);
    writer.close();
    outputStream.close();
}

private static void read(String file, Charset windowsCharset) throws IOException {
    ClassPathResource pathResource = new ClassPathResource(file);
    String string = IOUtils.toString(pathResource.getInputStream(), windowsCharset);
    System.out.println(string);
}

public static void main(String[] args) throws IOException {
    Charset windowsCharset = Charset.forName("windows-1252");
    String file = "test.txt";
    write(file, windowsCharset);
    read(file, windowsCharset);
}
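Your write method is wrong: it uses a Writer to write bytes. A Writer is meant for characters or strings. IOUtils.write(byte[], Writer) first decodes the bytes back into a String using the platform default charset, so if that default is not windows-1252, the umlauts are corrupted before the writer re-encodes them.
You already encoded the string into bytes with the line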
byte[] bytes = html.getBytes(charset);
These bytes can simply be written into an output stream:
IOUtils.write(bytes, outputStream);
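This makes the writer unnecessary, so remove it; the file will then contain the correct windows-1252 bytes. Alternatively, keep the writer and skip getBytes, passing the string itself with writer.write(html). Either way, the text must be encoded exactly once.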
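For illustration, here is a minimal sketch of both correct variants using only the JDK (no commons-io or Spring). The class name Cp1252RoundTrip, its method names, and the temp-file name are made up for this example, and the sample text is shortened. The umlauts are written as Unicode escapes so the result does not depend on the encoding of the source file, and the program prints a boolean rather than the text because the console encoding can distort umlauts on its own. The misdecode helper reproduces the kind of damage described above, assuming a platform default of UTF-8.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class Cp1252RoundTrip {

    // Variant 1: encode the String yourself, then write the raw bytes to a stream.
    static void writeBytes(Path path, String text, Charset charset) throws IOException {
        byte[] bytes = text.getBytes(charset);
        try (OutputStream out = Files.newOutputStream(path)) {
            out.write(bytes);
        }
    }

    // Variant 2: hand the String to a Writer and let the Writer do the encoding.
    static void writeChars(Path path, String text, Charset charset) throws IOException {
        try (Writer writer = Files.newBufferedWriter(path, charset)) {
            writer.write(text);
        }
    }

    // Reads the whole file and decodes it with the given charset.
    static String read(Path path, Charset charset) throws IOException {
        return new String(Files.readAllBytes(path), charset);
    }

    // Shows the failure mode: bytes encoded with one charset, decoded with another.
    static String misdecode(String text, Charset encodedAs, Charset decodedAs) {
        return new String(text.getBytes(encodedAs), decodedAs);
    }

    public static void main(String[] args) throws IOException {
        Charset cp1252 = Charset.forName("windows-1252");
        // "Während für O-Töne Überblick", written with Unicode escapes
        String text = "W\u00e4hrend f\u00fcr O-T\u00f6ne \u00dcberblick";
        Path path = Files.createTempFile("cp1252-demo", ".txt");

        writeBytes(path, text, cp1252);
        System.out.println(text.equals(read(path, cp1252)));   // true

        writeChars(path, text, cp1252);
        System.out.println(text.equals(read(path, cp1252)));   // true

        // windows-1252 bytes read as UTF-8 are malformed and become U+FFFD
        String broken = misdecode(text, cp1252, StandardCharsets.UTF_8);
        System.out.println(broken.indexOf('\uFFFD') >= 0);     // true

        Files.delete(path);
    }
}
```

Both variants encode the text exactly once, which is why the round trip succeeds. The original code encoded it, decoded it with a different charset, and then encoded it again.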