java, xml, encoding, sax

SAX parser breaking on ñ


I have implemented a SAX parser in Java by extending DefaultHandler. The XML has an ñ in its content, and parsing breaks when it reaches this character. I print out the char array in the characters method, and it simply ends with the character before the ñ. The parser seems to stop after this: no other methods are called even though there is still much more content, i.e. the endElement method is never called again. Has anyone run into this problem before, or does anyone have a suggestion on how to deal with it?


Solution

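  • What's the encoding of the file? Make sure the file's encoding declaration matches it. Without a declaration (or a byte order mark), an XML parser reading raw bytes assumes UTF-8, so a file actually saved as ISO-8859-1 will fail at the first non-ASCII byte such as ñ. You can set the encoding like so: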

    <?xml version="1.0" encoding="UTF-8"?>
    

    UTF-8 covers that character; just make sure the file is actually saved as UTF-8.
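
If the file cannot be changed, the encoding can also be supplied from the Java side. The following is a minimal sketch, not the asker's code: the class name `EncodingAwareSaxDemo` and the helper `extractText` are made up for illustration, and the sample document is an assumption. It uses `InputSource.setEncoding`, which only has an effect when the parser is given a byte stream rather than a Reader. It also lets any exception from `parse` propagate, which is worth checking in the original code: a parser that appears to stop silently has often thrown an exception that was caught and discarded.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class EncodingAwareSaxDemo {

    // Hypothetical helper: parses xmlBytes and returns all character data
    // concatenated. A non-null encoding is handed to the parser via
    // InputSource.setEncoding; null leaves detection to the parser.
    public static String extractText(byte[] xmlBytes, String encoding) throws Exception {
        StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                // characters() may be called several times per element,
                // so accumulate instead of assuming one call.
                text.append(ch, start, length);
            }
        };
        InputSource source = new InputSource(new ByteArrayInputStream(xmlBytes));
        if (encoding != null) {
            source.setEncoding(encoding);
        }
        // Exceptions are deliberately not swallowed here.
        SAXParserFactory.newInstance().newSAXParser().parse(source, handler);
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        // Assumed sample: a document saved as ISO-8859-1 with no declaration.
        byte[] latin1 = "<name>Espa\u00f1a</name>".getBytes(StandardCharsets.ISO_8859_1);

        // With the real encoding supplied, the ñ is decoded correctly.
        System.out.println(extractText(latin1, "ISO-8859-1"));

        // Without it, the parser treats the bytes as UTF-8 and fails at the ñ.
        try {
            extractText(latin1, null);
        } catch (Exception e) {
            System.out.println("Parse failed: " + e);
        }
    }
}
```

Fixing the declaration in the file itself is still the better option when possible, since it keeps the document self-describing for every consumer, not just this parser.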