Tags: scala, encoding, decoding, iso-8859-1, windows-1252

Bridge difference between Windows-1252 and ISO-8859-1


I am having trouble with character encoding in Scala.

The Scala app I am working on connects to a database that is encoded in Windows-1252.

But the Scala app itself uses ISO-8859-1.

I cannot change these encodings.

Because of this, some characters come out unknown or mis-encoded when a row is read from the DB and processed in the Scala code.

Setting the file.encoding system property did not work.

This almost worked and fixed some of the characters, but not all of them:

new String(databaseStringValue.getBytes("ISO-8859-1"), "Windows-1252")

And when I try this:

import java.nio.CharBuffer
import java.nio.charset.{Charset, CharsetEncoder}

private val encoder: CharsetEncoder = Charset.forName("Windows-1252").newEncoder()
...
val cp1252Buffer = encoder.encode(CharBuffer.wrap(databaseStringValue))

I get an UnmappableCharacterException.

Please help.


Solution

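  • This is impossible.

    Windows-1252 contains characters that do not exist in ISO 8859-1: it assigns printable characters such as €, curly quotes, and dashes to the bytes 0x80 to 0x9F, where ISO 8859-1 has only control codes. Those characters have no ISO 8859-1 representation, so Windows-1252 text cannot be mapped to ISO 8859-1 without loss.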
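
To see which characters are affected, a minimal sketch along the following lines could be used. It decodes each byte from 0x80 to 0x9F as Windows-1252 and asks an ISO-8859-1 encoder whether it can represent the result. The object and method names (Cp1252VsLatin1, unmappableInLatin1, fitsInLatin1) are made up for this illustration, and it assumes a JVM that ships the windows-1252 charset, as standard JDKs do.

```scala
import java.nio.charset.Charset

// Hypothetical helper object, written only to illustrate the mismatch.
object Cp1252VsLatin1 {
  private val cp1252 = Charset.forName("windows-1252")
  private val latin1 = Charset.forName("ISO-8859-1")

  // Windows-1252 bytes in 0x80..0x9F whose decoded character
  // cannot be encoded in ISO-8859-1.
  def unmappableInLatin1: Seq[(Int, Char)] = {
    val latin1Encoder = latin1.newEncoder()
    for {
      b <- 0x80 to 0x9F
      ch = new String(Array(b.toByte), cp1252).charAt(0)
      // Skip bytes that decode to the replacement character U+FFFD
      // (bytes that Windows-1252 leaves undefined).
      if ch != '\uFFFD' && !latin1Encoder.canEncode(ch)
    } yield (b, ch)
  }

  // True if every character of s has an ISO-8859-1 representation.
  def fitsInLatin1(s: String): Boolean =
    latin1.newEncoder().canEncode(s)

  def main(args: Array[String]): Unit = {
    unmappableInLatin1.foreach { case (b, ch) =>
      println(f"0x$b%02X -> U+${ch.toInt}%04X $ch")
    }
    println(fitsInLatin1("caf\u00E9"))      // é exists in ISO-8859-1
    println(fitsInLatin1("5 \u20AC"))       // the euro sign does not
  }
}
```

Any string for which fitsInLatin1 returns false contains at least one character that would be lost or replaced when forced into ISO-8859-1, which is the loss the answer above describes.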