How to replace HTML escape characters in JSON String in Android


I am currently developing an Android app with server synchronization. A PHP script sends me database information from the server, encoded as JSON. In addition, the German ä, ö, ü, ß and other characters are HTML-escaped as &auml, &ouml, &uuml etc. When the data arrives in my Android app, I want to decode these escape sequences.

I searched for how to replace HTML escape characters in Android and found Html.fromHtml(htmlString).toString() and StringEscapeUtils.unescapeHtml3(htmlString) from the org.apache.commons.lang library. However, neither of them changed the string at all.

An example of the strings the app receives: {"categories":[[0,"Rollen f&uumlr vergrabene Karte"]],"roles":[],"sets":[],"set_roles":[],"teams":[]}

I also tried decoding only the inner parts, but that did not work either. The input then looks like this: [[0,"Rollen f&uumlr vergrabene Karte"]]

How can I decode those characters without having to replaceAll every single one of them?


Solution

  • Html.fromHtml is supposed to do the job.

    However, looking at the JSON you're showing, the server is apparently encoding the HTML escapes incorrectly. The terminating semicolon is missing: f&uumlr should be f&uuml;r

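    I've checked using the escapes &uacute; and &uuml; (with the semicolon) and it works.

    // Html is android.text.Html; both entities below end with ';', so they are decoded.
    String json = "{\"categories\":[[0,\"Rollen f&uacute;oobar t&uuml;oo vergrabene Karte\"]],\"roles\":[],\"sets\":[],\"set_roles\":[],\"teams\":[]}";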
    
    System.out.println("=======>" + Html.fromHtml(json).toString());
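
If the server cannot be fixed, one possible workaround is to add the missing semicolons on the client before decoding. The following is only a minimal sketch, not a tested production solution: the class name EntityRepair, the method addMissingSemicolons, and the short list of entity names are assumptions made for illustration, and the list would need to be extended to cover whatever entities the server actually sends.

```java
import java.util.regex.Pattern;

class EntityRepair {
    // Assumption: only these entity names occur in the data; extend the list as needed.
    // The negative lookahead (?!;) skips entities that are already terminated.
    private static final Pattern UNTERMINATED =
            Pattern.compile("&(auml|ouml|uuml|Auml|Ouml|Uuml|szlig)(?!;)");

    /** Appends the missing ';' to the known entity names and leaves everything else untouched. */
    static String addMissingSemicolons(String text) {
        return UNTERMINATED.matcher(text).replaceAll("&$1;");
    }
}
```

On Android, the repaired string could then be passed on as in the snippet above, e.g. Html.fromHtml(EntityRepair.addMissingSemicolons(json)).toString(). This still avoids one replaceAll call per character, but correcting the escapes on the server side remains the cleaner option.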