Tags: java, character-encoding, non-ascii-characters

Java's charsets / character encoding


I have a file in Spanish so it's full of characters like:

 á é í ó ú ñ Ñ Á É Í Ó Ú 

I have to read the file, so I do this:

fr = new FileReader(ficheroEntrada);
BufferedReader rEntrada = new BufferedReader(fr);

String linea = rEntrada.readLine();
if (linea == null) {
    logger.error("ERROR: Empty file.");
    return null;
}
String delimitador = "[;]";
String[] tokens = null;

List<String> token = new ArrayList<String>();
while ((linea = rEntrada.readLine()) != null) {
    // Some parsing specific to my file. 
    tokens = linea.split(delimitador);
    token.add(tokens[0]);
    token.add(tokens[1]);
}
logger.info("List of tokens: " + token);
return token;

When I print the list of tokens, all the special characters are gone, replaced by characters like these:

Ó = Ó
Ñ = Ñ

And so on...

What's happening? I've never had problems with charsets before (I'm assuming it's a charset issue). Is it because of this computer? What can I do?

Any extra advice will be appreciated, I'm learning! Thank you!


Solution

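  • FileReader decodes the file with the platform's default charset, which on this machine evidently does not match the file's actual encoding. Specify the encoding explicitly when you open the file (UTF-8 here, assuming that is how the file was saved):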

    // Pass the file itself (ficheroEntrada), not the FileReader, to FileInputStream.
    BufferedReader rEntrada = new BufferedReader(
        new InputStreamReader(new FileInputStream(ficheroEntrada), "UTF-8"));
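
To see the effect of the charset choice in isolation, here is a minimal sketch, assuming the file is UTF-8. The class name CharsetDemo, the method readTokens and the sample data are made up for illustration; the loop is a simplified version of the one in the question. It writes a small UTF-8 file and reads it back twice: once as UTF-8 and once as ISO-8859-1, which reproduces the kind of garbling described above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of the original code.
public class CharsetDemo {

    // Reads the file with the given charset and collects the first two
    // ';'-separated fields of every line. Lines with fewer than two
    // fields are skipped (the question's loop would throw on them).
    public static List<String> readTokens(Path file, Charset charset) throws IOException {
        List<String> tokens = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(file, charset)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String[] fields = line.split(";");
                if (fields.length >= 2) {
                    tokens.add(fields[0]);
                    tokens.add(fields[1]);
                }
            }
        }
        return tokens;
    }

    public static void main(String[] args) throws IOException {
        // Sample data written with Unicode escapes so the source file's own
        // encoding does not matter: "Ñandú;Óscar".
        Path file = Files.createTempFile("spanish", ".csv");
        Files.write(file, "\u00D1and\u00FA;\u00D3scar\n".getBytes(StandardCharsets.UTF_8));

        // Decoded with the charset the file was written in: accents survive.
        System.out.println(readTokens(file, StandardCharsets.UTF_8));

        // Decoded with a single-byte charset: every accented letter turns
        // into two characters, the first of which is "Ã".
        System.out.println(readTokens(file, StandardCharsets.ISO_8859_1));

        Files.delete(file);
    }
}
```

How the output looks on screen also depends on the console's encoding, so compare the strings in code if in doubt. Note that Files.newBufferedReader throws a MalformedInputException on bytes that are invalid for the requested charset instead of silently substituting characters, so if the file turns out not to be UTF-8 you will find out immediately.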