java, file-io, utf-16

File written in UTF-16 can't be opened, but UTF-8 can


I'm supposed to write some text to a file in UTF-16 encoding, then do some operations on it (count lines, words, etc.). The problem is that if I choose UTF-16 an exception is thrown, while UTF-8 works fine.

import java.io.*;
import java.util.Scanner;

public final class Q4 {
    public static void main(String[] args) throws FileNotFoundException {
        final String ENCODING = "UTF-16";
        final String FILE = " testcount";
        PrintWriter out = null;
// Given code – do not modify(!) This will create the UTF-16 test file on your drive.
        try {
            out = new PrintWriter(FILE, ENCODING);
            out.write("Test file for UTF-16\n" + "(contains surrogate pairs:\n" +
                    "Musical symbols in the range 1D100–1D1FF)\n\n");
            out.write("F-clef (1D122): \uD834\uDD22\tCrotchet (1D15F): \uD834\uDD5F\n");
            out.write("G-clef (1D120): \uD834\uDD20\tSemiquaver (1D161): \uD834\uDD61\n");
            out.write("\n(? lines, ?? words, ??? chars but ??? code points)\n");
        } catch (IOException e) {
            System.out.println("uh? cannot write to file!");
        } finally {
            if (out != null) out.close();
        }
// Your code – scan the test file and count lines, words, characters, and code points.

        Scanner fin = new Scanner(new File(FILE));
        String s = "";

        //get the data in file
        while (fin.hasNext()){
            s = s + fin.next();      
            System.out.println(s);            
        }

        fin.close();

        //count words and lines



    }
}

My only guess, a far-fetched one, is that it has something to do with the OS (Windows 8.1) not being able to save a UTF-16 file, but that sounds like a silly guess.


Solution

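  • Specify the encoding when you read the file. The Scanner(File) constructor decodes with the platform's default charset, which is not UTF-16 here, so pass the same encoding you used for writing: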

    Scanner fin = new Scanner(new File(FILE), ENCODING);
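
With the encoding passed to the Scanner, the counting part of the exercise could look roughly like the sketch below. This is only one possible approach, not the assignment's reference solution: the class name Utf16CountSketch, the Counts holder and the count method are hypothetical names made up for illustration, "words" are assumed to be whitespace-separated tokens, and line terminators are not included in the character totals.

```java
import java.io.File;
import java.io.IOException;
import java.io.PrintWriter;
import java.util.Scanner;

public final class Utf16CountSketch {

    /** Hypothetical holder for the four totals (not part of the original assignment). */
    static final class Counts {
        int lines, words, chars, codePoints;
    }

    /** Reads the file with the given encoding and tallies lines, words, chars and code points. */
    static Counts count(File file, String encoding) throws IOException {
        Counts c = new Counts();
        try (Scanner in = new Scanner(file, encoding)) {
            while (in.hasNextLine()) {
                String line = in.nextLine();
                c.lines++;
                // length() counts UTF-16 code units, so a surrogate pair counts as 2
                c.chars += line.length();
                // codePointCount() counts a surrogate pair as a single code point
                c.codePoints += line.codePointCount(0, line.length());
                // Assumption: a "word" is any run of non-whitespace characters
                String trimmed = line.trim();
                if (!trimmed.isEmpty()) {
                    c.words += trimmed.split("\\s+").length;
                }
            }
            // Scanner swallows read errors; rethrow one if it occurred
            if (in.ioException() != null) {
                throw in.ioException();
            }
        }
        return c;
    }

    public static void main(String[] args) throws IOException {
        // Write a small UTF-16 sample to a temporary file, then read it back
        File file = File.createTempFile("testcount", ".txt");
        file.deleteOnExit();
        try (PrintWriter out = new PrintWriter(file, "UTF-16")) {
            out.write("G-clef: \uD834\uDD20\tF-clef: \uD834\uDD22\n");
            out.write("second line\n");
        }
        Counts c = count(file, "UTF-16");
        System.out.println(c.lines + " lines, " + c.words + " words, "
                + c.chars + " chars, " + c.codePoints + " code points");
    }
}
```

For the two-line sample in main, the char total comes out two higher than the code point total, because each of the two musical symbols is stored as a surrogate pair.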