java utf-8 character-encoding iso-8859-1

Converting UTF-8 encoded text that uses only Latin-1 characters to ISO-8859-1 bytes


It's been a long day. I'm not sure if I'm overlooking something, or if there isn't an easy answer to my problem.

Here's my scenario:

  • I am sending text data as bytes to a system that does not support UTF-8 encoding.
  • It has a custom character set, but I only need the characters whose encoding matches ISO-8859-1 / Latin-1.
  • My incoming data is UTF-8 encoded and contains only ASCII plus a small number of non-ASCII characters, all of which are in the Latin-1 code page.

In my attempts to re-encode these Strings, the non-ASCII characters end up being sent as '?' replacements, as only the second byte of their UTF-8 sequence, or as both UTF-8 bytes unchanged.

Is there a simple way to take this incoming data, which uses 2 bytes for each of these Latin-1 code page characters, and encode it as single ISO-8859-1 bytes?


Solution

  • On the reader side you need something like:

    new InputStreamReader(underlyingInputStream, "UTF-8")
    

    On the writer side:

    new OutputStreamWriter(underlyingOutputStream, "ISO-8859-1")
    

    The reader decodes the incoming UTF-8 bytes into characters, and the writer encodes those characters as ISO-8859-1 bytes, so each Latin-1 character should go out as a single byte.
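
To make the reader/writer pairing above concrete, here is a minimal sketch. The class name `Latin1Transcoder` and its two methods are hypothetical names chosen for illustration, not part of the original answer. It assumes the input really is valid UTF-8 and that every character in it exists in ISO-8859-1. It shows the stream-based approach and an in-memory equivalent that uses `String.getBytes`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
class Latin1Transcoder {

    // Stream variant: decode UTF-8 bytes from `in` into chars,
    // then encode those chars as ISO-8859-1 bytes onto `out`.
    static void transcode(InputStream in, OutputStream out) throws IOException {
        Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8);
        Writer writer = new OutputStreamWriter(out, StandardCharsets.ISO_8859_1);
        char[] buffer = new char[4096];
        int n;
        while ((n = reader.read(buffer)) != -1) {
            writer.write(buffer, 0, n);
        }
        // Flush so buffered bytes reach `out`; closing is left to the caller.
        writer.flush();
    }

    // In-memory variant: the same decode-then-encode step on a byte array.
    static byte[] toLatin1(byte[] utf8) {
        String text = new String(utf8, StandardCharsets.UTF_8);
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) throws IOException {
        // "café": the é takes two bytes in UTF-8 (0xC3 0xA9), one in Latin-1 (0xE9).
        byte[] utf8 = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        transcode(new ByteArrayInputStream(utf8), out);
        System.out.println(utf8.length + " -> " + out.toByteArray().length); // prints "5 -> 4"
    }
}
```

Both variants silently replace any character that ISO-8859-1 cannot represent with '?'. If '?' shows up for characters that are in Latin-1, a likely cause (an assumption, since the question does not show the failing code) is that the bytes were decoded with the wrong charset before being re-encoded.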