I am going crazy with UTF-8 decoding of some URLs. I am using
URLDecoder.decode (java.net.URLDecoder)
to decode some URLs with special characters. As you can see below, for some location names in the URL the decoding works and for some it does not:
biha%C4%87 --> biha? (WRONG)
d%C3%A9partement+morbihan --> département morbihan (CORRECT)
gespanschaft+me%C4%91imurje --> gespanschaft me?imurje (WRONG)
hajd%C3%BA+bihar --> hajdú bihar (CORRECT)
Any ideas? I would highly appreciate it! Thom
Using URLDecoder.decode(url, "UTF-8")
all your URLs are decoded correctly.
However, the decoded strings of cases 1 and 3 contain characters with code points 263 (ć) and 273 (đ).
Most likely you printed these strings to a console that cannot display characters with code points > 255 and replaces them with a ?.
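To check this yourself, here is a minimal sketch that decodes your four samples and prints the highest code point in each result. The class name DecodeCheck and the helper decodeUtf8 are made up for illustration. Wrapping System.out in a UTF-8 PrintStream only helps if your console actually interprets its input as UTF-8, which is an assumption about your setup.

```java
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

class DecodeCheck {

    // Hypothetical helper: decodes a URL-encoded string as UTF-8.
    static String decodeUtf8(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        String[] samples = {
            "biha%C4%87",
            "d%C3%A9partement+morbihan",
            "gespanschaft+me%C4%91imurje",
            "hajd%C3%BA+bihar"
        };

        // Assumption: the console displays UTF-8 bytes correctly.
        PrintStream out = new PrintStream(System.out, true, "UTF-8");

        for (String sample : samples) {
            String decoded = decodeUtf8(sample);
            // Highest code point in the decoded string; values above 255
            // cannot be represented in ISO-8859-1 style encodings.
            int highest = decoded.codePoints().max().orElse(0);
            out.println(sample + " --> " + decoded
                    + " (highest code point: " + highest + ")");
        }
    }
}
```

If the reported code points for cases 1 and 3 are 263 and 273 while the text still shows a ?, the decoding is fine and the problem is in how the output is displayed.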