java file encoding stream character-encoding

Java: How to determine the correct charset encoding of a stream


With reference to the following thread: Java App : Unable to read iso-8859-1 encoded file correctly

What is the best way to programmatically determine the correct charset encoding of an input stream or file?

I have tried using the following:

File in = new File(args[0]);
InputStreamReader r = new InputStreamReader(new FileInputStream(in));
System.out.println(r.getEncoding());

But on a file that I know is encoded as ISO8859_1, the above code prints ASCII, which is incorrect, and I cannot render the file's content correctly on the console.
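A likely explanation for the output above: InputStreamReader.getEncoding() returns the name of the charset the reader was constructed with, and the one-argument constructor falls back to the platform default charset. It never inspects the bytes. The minimal sketch below illustrates this; the class name GetEncodingDemo and the helper reportedEncoding are made up for the example, and it reads from an in-memory byte array instead of a file.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class GetEncodingDemo {
    // Hypothetical helper: returns the encoding name the reader reports
    // when it is constructed over the given bytes with the given charset.
    // The reader is left open because getEncoding() may return null once closed.
    static String reportedEncoding(byte[] data, Charset charset) {
        InputStreamReader r = new InputStreamReader(new ByteArrayInputStream(data), charset);
        return r.getEncoding();
    }

    public static void main(String[] args) {
        // "café" encoded as ISO-8859-1: the last byte is 0xE9.
        byte[] latin1Bytes = "caf\u00e9".getBytes(StandardCharsets.ISO_8859_1);

        // Same bytes, different constructor argument: the reported name follows
        // the argument, not the content.
        System.out.println(reportedEncoding(latin1Bytes, StandardCharsets.ISO_8859_1));
        System.out.println(reportedEncoding(latin1Bytes, StandardCharsets.UTF_8));

        // What a reader created without an explicit charset would use.
        System.out.println(Charset.defaultCharset());
    }
}
```

If this is what happened, the ASCII result describes the JVM's default charset on that machine, not the file.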


Solution

  • I have used this library, which is similar to jchardet, to detect encodings in Java: https://github.com/albfernandez/juniversalchardet
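
Detection of this kind is always a guess, since raw bytes do not carry their charset. As a dependency-free illustration of the idea, the sketch below tries strict decoding against a few candidate charsets and picks the first that decodes without error. This is not the API of the library above and is far cruder than its statistical approach; the class name StrictDecodeGuess, the method names, and the candidate order are assumptions made for the example. ISO-8859-1 is tried last because every byte sequence is valid in it.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class StrictDecodeGuess {
    // Returns true if the bytes decode in the given charset without any
    // malformed or unmappable input.
    static boolean decodesCleanly(byte[] data, Charset charset) {
        try {
            charset.newDecoder()
                   .onMalformedInput(CodingErrorAction.REPORT)
                   .onUnmappableCharacter(CodingErrorAction.REPORT)
                   .decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Hypothetical heuristic: most restrictive charset first, ISO-8859-1 as
    // the catch-all fallback because it accepts any byte sequence.
    static Charset guess(byte[] data) {
        if (decodesCleanly(data, StandardCharsets.US_ASCII)) {
            return StandardCharsets.US_ASCII;
        }
        if (decodesCleanly(data, StandardCharsets.UTF_8)) {
            return StandardCharsets.UTF_8;
        }
        return StandardCharsets.ISO_8859_1;
    }

    public static void main(String[] args) {
        byte[] latin1 = "caf\u00e9".getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8 = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        System.out.println(guess(latin1));
        System.out.println(guess(utf8));
    }
}
```

A fallback like this cannot tell ISO-8859-1 apart from other single-byte encodings such as windows-1252, which is one reason to prefer a dedicated detection library for real input.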