I have a set of strings (String objects) in Java and would like to write them to a file so that I can retrieve them later.
I understand that Java uses UTF-16 to store strings internally. I am worried that I might corrupt the strings through encoding issues unless I write and read them properly. I do not want to dump the String objects raw to the file, because I would like to be able to open the file in a standard text editor and see each string on its own line (assuming no string contains a line break).
Can I simply use, say, the PrintWriter class with the println(String x) method (assuming there are no line breaks in the strings), combined with the Scanner class's nextLine() method when reading them back? Would this guarantee that I get the exact same strings back?
Further, suppose the strings do contain line breaks. What is the appropriate way of writing them then? Should I filter out line breaks (replacing them with some ad-hoc escape code or similar) and use the println method with PrintWriter as above?
For completeness I am answering my own question with the solution I eventually adopted. In retrospect the solution is very straightforward. Duh!
To write the strings I use the BufferedWriter class, which has convenient methods for writing strings. The BufferedWriter is obtained through:
    writer = new BufferedWriter(
        new OutputStreamWriter(
            new FileOutputStream(filename), "UTF-8"));
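Here I have specified the UTF-8 encoding explicitly, so the output does not depend on the platform's default charset. UTF-8 can represent every Unicode character and is understood by virtually every text editor.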
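For reference, here is a minimal sketch of how the writing side could look as a complete program. The class name Utf8LineWriter, the helper writeLines and the file name strings.txt are my own placeholders, not anything from a library. It assumes that no string contains a line break and uses StandardCharsets.UTF_8 (Java 7+) instead of the "UTF-8" name, which avoids the checked UnsupportedEncodingException.

```java
import java.io.BufferedWriter;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

public class Utf8LineWriter {

    // Hypothetical helper: writes each string on its own line, encoded as UTF-8.
    // Assumes that no string contains a line break.
    static void writeLines(String filename, List<String> lines) throws IOException {
        // try-with-resources flushes and closes the writer even if an exception is thrown
        try (BufferedWriter writer = new BufferedWriter(
                new OutputStreamWriter(
                        new FileOutputStream(filename), StandardCharsets.UTF_8))) {
            for (String line : lines) {
                writer.write(line);
                writer.newLine(); // writes the platform's line separator
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // "caf\u00e9" is "café"; the escape keeps this source file pure ASCII
        writeLines("strings.txt", Arrays.asList("first string", "caf\u00e9"));
    }
}
```

Closing the writer matters here: a BufferedWriter that is never flushed or closed can leave the file truncated.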
To read the strings back I use the BufferedReader class and make sure to use the UTF-8 encoding:
    reader = new BufferedReader(
        new InputStreamReader(
            new FileInputStream(filename), "UTF-8"));
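The reading side can be sketched in the same way. Again, Utf8LineReader, readLines and strings.txt are placeholder names of mine. The sketch relies on BufferedReader.readLine(), which returns each line without its terminator and returns null at the end of the file. Note that readLine() treats \n, \r and \r\n all as line terminators, so a string that itself contained a line break would come back as several strings; such strings would need some escaping scheme before being written.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class Utf8LineReader {

    // Hypothetical helper: reads a UTF-8 encoded file and returns one string per line.
    static List<String> readLines(String filename) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(
                        new FileInputStream(filename), StandardCharsets.UTF_8))) {
            String line;
            // readLine() strips the line terminator and returns null at end of file
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Reads the file named on the command line, or strings.txt by default
        String filename = args.length > 0 ? args[0] : "strings.txt";
        for (String s : readLines(filename)) {
            System.out.println(s);
        }
    }
}
```

As long as the same encoding is used on both sides and the strings contain no line breaks, the strings read back should equal the ones written.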