java, eclipse, xml-parsing, sax

Umlaut in Java SAX Parser


I am having trouble with German umlauts in an XML document I received.

The parser displays and saves the value as "Ã¼" instead of "ü".

The XML encoding is declared as UTF-8, which should be able to represent umlauts.

Also, I couldn't find any option to set a locale on the SAX parser.

Is there any other way I can make the values save correctly?

By the way, I am using Eclipse as my IDE.

Any help is much appreciated!

Thanks in advance!


Solution

  • The XML is encoded in UTF-8, but you are decoding it as ISO-8859-1.

    Pass the XML to the parser through an InputStream or another byte-oriented API. Avoid wrapping it in a Reader or converting a byte[] to a String before parsing. Given the raw bytes, the parser can read the encoding declaration and decode the document itself; you are much more likely to get the character encoding wrong than the parser is.
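
As a minimal sketch of the advice above (not the asker's actual code, which is not shown), the following hands raw bytes to the SAX parser and lets it honor the UTF-8 declaration. The class name `UmlautSaxDemo`, the helpers `extractText` and `misdecode`, and the sample `<name>` document are hypothetical and exist only for illustration. `misdecode` reproduces the symptom from the question by decoding UTF-8 bytes as ISO-8859-1.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class UmlautSaxDemo {

    // Hypothetical helper: parses XML from raw bytes and returns all character data.
    // The parser decodes the stream itself, using the encoding declared in the document.
    static String extractText(InputStream in)
            throws IOException, SAXException, ParserConfigurationException {
        StringBuilder text = new StringBuilder();
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(in, new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                // characters() may be called several times per element, so append.
                text.append(ch, start, length);
            }
        });
        return text.toString();
    }

    // Hypothetical helper: shows the bug by decoding UTF-8 bytes as ISO-8859-1.
    static String misdecode(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) throws Exception {
        // \u00FC is "ü"; escapes keep this source file independent of its own encoding.
        byte[] xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><name>M\u00FCller</name>"
                .getBytes(StandardCharsets.UTF_8);

        String parsed = extractText(new ByteArrayInputStream(xml));
        System.out.println(parsed.equals("M\u00FCller"));

        // The wrong charset turns one character into two: "Ã¼" (\u00C3\u00BC).
        System.out.println(misdecode("\u00FC").equals("\u00C3\u00BC"));
    }
}
```

For a file on disk, passing a `FileInputStream` (or the `File` itself) to `parse` follows the same pattern. The sketch compares strings rather than printing them, because a console with a different encoding can garble the output even when the parsed value is correct.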