Tags: java, utf-8, randomaccessfile

How to write UTF-8 data to an XML file using RandomAccessFile?


When I try to write some UTF-8 data to a file, I end up with garbage in the file. The code is as follows:

public static boolean saveToFile(StringBuffer buffer,
                                 String fileName,
                                 ArrayList exceptionList,
                                 String className)
{
    log.debug("In saveToFile for file [" + fileName + "]");

    RandomAccessFile raf = null;
    File file = new File(fileName);
    File backupFile = new File(fileName + "_bck");

    try
    {
        if (file.exists())
        {
            if (backupFile.exists())
            {
                backupFile.delete();
            }
            file.renameTo(backupFile);
        }
        raf = new RandomAccessFile(file, "rw");
        raf.writeBytes(buffer.toString());
        raf.close();

The output of buffer.toString() is

<?xml version="1.0" encoding="UTF-8"?>
<ivr>
<version>1.1</version>
<templateName>αβγδεζη</templateName>

The data in the file however is

<?xml version="1.0" encoding="UTF-8"?>
<ivr>
<version>1.1</version>
<templateName>▒▒▒▒▒▒▒</templateName>

How can I make sure that the data in the file itself is UTF-8?


Solution

  • I'm not surprised you get garbage:

     raf.writeBytes(buffer.toString())
    

    The documentation for RandomAccessFile.writeBytes(String) says (note the final clause):

    Writes the string to the file as a sequence of bytes. Each character in the string is written out, in sequence, by discarding its high eight bits.

    If the string contains only ASCII characters, that operation happens to produce a correctly encoded file. For anything else, including the Greek letters in your template name, it won't. That writeBytes() method is a foolish design by the Java developers. You need to correctly encode your text as bytes in UTF-8, and then write those bytes.
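To make the effect of "discarding its high eight bits" concrete, here is a small sketch that imitates it by hand and compares the result with a real UTF-8 encoding. The class and helper names (HighBitsDemo, lowBytes, hex) are made up for illustration; lowBytes only mimics the documented behaviour and is not the JDK's implementation.

```java
import java.nio.charset.StandardCharsets;

class HighBitsDemo {
    // Imitates the documented behaviour of writeBytes(String):
    // keep only the low eight bits of each 16-bit char.
    static byte[] lowBytes(String s) {
        byte[] out = new byte[s.length()];
        for (int i = 0; i < s.length(); i++) {
            out[i] = (byte) s.charAt(i);
        }
        return out;
    }

    // Formats bytes as space-separated uppercase hex pairs.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        String text = "\u03B1\u03B2"; // the Greek letters alpha and beta
        System.out.println(hex(lowBytes(text)));                        // B1 B2
        System.out.println(hex(text.getBytes(StandardCharsets.UTF_8))); // CE B1 CE B2
    }
}
```

The truncated bytes B1 B2 are not a valid UTF-8 sequence, which is why a viewer expecting UTF-8 shows placeholder glyphs.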

    Do you really need to operate on the file as a random access file? If not, just write it with a Writer wrapping an OutputStream.
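If random access is not needed, the Writer approach might look roughly like the sketch below. The class and method names (Utf8WriterSketch, writeUtf8) are hypothetical, and the backup and logging logic from the question is left out.

```java
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class Utf8WriterSketch {
    // Hypothetical helper: replaces the content of 'target' with 'text',
    // encoded as UTF-8. The charset is named explicitly so the result
    // does not depend on the platform default encoding.
    static void writeUtf8(Path target, CharSequence text) throws IOException {
        try (Writer out = new OutputStreamWriter(
                Files.newOutputStream(target), StandardCharsets.UTF_8)) {
            out.append(text);
        }
    }
}
```

A StringBuffer can be passed directly because it implements CharSequence, for example writeUtf8(Paths.get(fileName), buffer).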

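    You could use Charset.encode(CharBuffer) to produce a ByteBuffer holding the encoded bytes, then write those bytes to the file. Write only the buffer's remaining bytes, because its backing array can be larger than the encoded content:

     ByteBuffer bytes = StandardCharsets.UTF_8.encode(CharBuffer.wrap(buffer));
     raf.write(bytes.array(), bytes.arrayOffset() + bytes.position(), bytes.remaining());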
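If you do want to keep RandomAccessFile, a simpler variant is to encode through String.getBytes and write the resulting array, which is exactly as long as the encoded text. This is a minimal sketch under stated assumptions: the names RafUtf8Sketch and writeUtf8 are invented here, and the setLength(0) call assumes you want to replace the whole file rather than patch part of it.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

class RafUtf8Sketch {
    // Hypothetical helper: overwrites 'fileName' with 'text' encoded as UTF-8.
    static void writeUtf8(String fileName, CharSequence text) throws IOException {
        // Encode first; getBytes returns an array sized to the encoded data.
        byte[] encoded = text.toString().getBytes(StandardCharsets.UTF_8);
        try (RandomAccessFile raf = new RandomAccessFile(fileName, "rw")) {
            raf.setLength(0);   // assumption: discard any previous, longer content
            raf.write(encoded); // raw bytes, no per-char truncation
        }
    }
}
```

RandomAccessFile.writeUTF is not a substitute here: it prefixes the data with a two-byte length and uses Java's modified UTF-8, so the output would not be a plain UTF-8 XML file.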