BufferedWriter outputting strange characters when saved to new file


I'm using the following code to process a large text file line by line. The text is in Croatian rather than English, and many of the characters appear as � in the output file. How can I resolve this?

The file is saved as ANSI, but that does not seem to be an encoding name compatible with InputStreamReader. What encoding should I save the original file as?

try (BufferedWriter bw = new BufferedWriter(new FileWriter(FILENAME))) {

 String line;
 try {
  try (
   InputStream fis = new FileInputStream("C:\\Users\\marti\\Documents\\Software Projects\\Java Projects\\TwitterAutoBot\\src\\main\\resources\\EH.Txt");
   InputStreamReader isr = new InputStreamReader(fis, Charset.forName("UTF-8"));
   BufferedReader br = new BufferedReader(isr);
  ) {
   while ((line = br.readLine()) != null) {
    // Deal with the line

    String content = line.substring(line.lastIndexOf("  ") + 1);
    System.out.println(content);

    bw.write("\n\n" + content);

   }
  }
 } catch (IOException e) {
  e.printStackTrace();
 }

 // bw.close();

} catch (IOException e) {

 e.printStackTrace();

}

Solution

  • I solved this by decoding the input with Cp1252 instead of UTF-8, because the file was saved as ANSI, which on Windows means the system code page rather than UTF-8.
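
A minimal sketch of that fix, assuming the input really is windows-1252 (Java also accepts the alias Cp1252). The class name RecodeExample and the helper recode are hypothetical and not part of the original code. The sketch also writes the output as UTF-8 explicitly, since FileWriter otherwise uses the platform default encoding. If letters such as č, ć, or đ still come out wrong, the file may be in windows-1250 (the Central European code page) instead, so the charset is left as a parameter.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class RecodeExample {

    // Hypothetical helper: decodes `source` with `sourceCharset` and
    // rewrites it line by line to `target`, encoded as UTF-8.
    public static void recode(Path source, Charset sourceCharset, Path target) throws IOException {
        try (BufferedReader br = new BufferedReader(
                 new InputStreamReader(Files.newInputStream(source), sourceCharset));
             BufferedWriter bw = new BufferedWriter(
                 new OutputStreamWriter(Files.newOutputStream(target), StandardCharsets.UTF_8))) {
            String line;
            while ((line = br.readLine()) != null) {
                bw.write(line);
                bw.newLine(); // platform line separator
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed usage: java RecodeExample <input file> <output file>
        Charset ansi = Charset.forName("windows-1252"); // assumption: the "ANSI" code page of the file
        recode(Paths.get(args[0]), ansi, Paths.get(args[1]));
    }
}
```

Both try-with-resources entries are closed automatically, so no explicit bw.close() is needed.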