java type-conversion binaryfiles binary-data raw-data

Is any data lost in Java if I convert binary data to a String and back?


I have a messaging class in my Java program that only handles String values, never binary data.

I want to send an RPM file (that is, binary data) through this messaging class to a receiver.

I know this can be done by converting the binary data to a String on the sending end and then back to a binary file on the receiving end.

However, my question is: will any data be lost when I convert my binary file to a String and then back to binary data to save as a binary file, or will the data be retained through all conversions?


Solution

  • Binary data means byte[], InputStream, OutputStream. Text is a separate thing: Java uses Unicode internally for String, char, Reader and Writer.

    Hence one should only convert binary data that actually represents text, and always specify the encoding of that binary data:

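    // requires: import java.nio.charset.StandardCharsets;
    byte[] bytes = ...                                     // bytes known to hold UTF-8 encoded text
    String s = new String(bytes, StandardCharsets.UTF_8);  // decode bytes to text
    bytes = s.getBytes(StandardCharsets.UTF_8);            // encode text back to bytes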
    

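    Non-text data should not be converted, because arbitrary bytes may form sequences that are illegal in the chosen encoding, especially in the multibyte encoding UTF-8. When new String(bytes, charset) meets such a sequence it substitutes a replacement character, so the original bytes cannot be recovered and data is lost. The conversion to Unicode is also an unnecessary inefficiency: a Java char is two bytes (UTF-16 encoded).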

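    It is better to use a ByteArrayInputStream, ByteArrayOutputStream or, for some purposes, a ByteBuffer. Never a String. If you insist on a String anyway, use StandardCharsets.ISO_8859_1: it is a single-byte encoding in which each of the 256 byte values maps to exactly one char, so the round trip is lossless.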
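The difference between the two charsets can be checked with a small experiment. The following is only a minimal sketch: the class name RoundTripDemo and the helper roundTrip are made up for illustration, and the 256-byte array merely stands in for the contents of a real binary file such as the RPM.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class RoundTripDemo {
    // Hypothetical helper: decode the bytes to a String with the given
    // charset, then encode that String back to bytes with the same charset.
    static byte[] roundTrip(byte[] data, Charset charset) {
        return new String(data, charset).getBytes(charset);
    }

    public static void main(String[] args) {
        // Every possible byte value once, as a stand-in for arbitrary binary content.
        byte[] data = new byte[256];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i;
        }

        boolean utf8Lossless =
                Arrays.equals(data, roundTrip(data, StandardCharsets.UTF_8));
        boolean latin1Lossless =
                Arrays.equals(data, roundTrip(data, StandardCharsets.ISO_8859_1));

        System.out.println("UTF-8 round trip lossless:      " + utf8Lossless);
        System.out.println("ISO-8859-1 round trip lossless: " + latin1Lossless);
    }
}
```

With this input the UTF-8 round trip does not reproduce the original array, because the bytes from 0x80 upward do not form valid UTF-8 sequences here. The ISO-8859-1 round trip does reproduce it. Even so, that only holds as long as the messaging layer passes every char of the String through unchanged.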