java android character-encoding http-post

ByteArrayOutputStream encoding issue


I have an encoding issue after downloading RSS feed data from a website: some characters are not interpreted properly. I call HttpResponse.getEntity(), read the InputStream in a loop, and write the bytes to a ByteArrayOutputStream.

For example, given a ByteArrayOutputStream bs, after writing to bs I use:

String test = bs.toString("UTF-8");

However, some characters come out like this:

Mytestï¼è¾å¸éï¼å°±è¢«æèªé²å¥é»å­éµä»¶ç³»çµ±ä¸äºéç¥å®¢æ¶

I cannot convert those characters. Any ideas?

Thank you


Solution

  • It's not UTF-8 encoded; it's likely Big5 encoded (your question history confirms that you're from China / Hong Kong).

    Mytest簿翹癡職疇繡矇簿翹疇簞簣癡瞽竄疆癡穠矇簡疇瞼矇罈疇簫矇繕瓣罈繞癟糧罈癟繕簣瓣繡瓣繙矇癟瞼疇簧瞽疆繞

    You should be able to determine that by reading HttpEntity#getContentType() yourself. It should return something like

    text/html;charset=Big5
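
Once you know the charset, you can pass it to toString() instead of hardcoding "UTF-8". Below is a minimal sketch of that idea, not a definitive implementation. The class name CharsetFromContentType and the helpers charsetOf and decode are hypothetical names made up for illustration, and the Content-Type value is passed in as a plain String (in your code it would presumably come from the value of the header returned by HttpEntity#getContentType()). It assumes the header carries a charset parameter and falls back to UTF-8 when that parameter is missing or not supported by the runtime.

```java
import java.io.ByteArrayOutputStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

public class CharsetFromContentType {

    // Hypothetical helper: extracts the charset parameter from a Content-Type
    // value such as "text/html;charset=Big5". Returns the fallback when the
    // parameter is absent, malformed, or names a charset this JVM lacks.
    static Charset charsetOf(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = p.substring("charset=".length()).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Covers both illegal and unsupported charset names.
                    return fallback;
                }
            }
        }
        return fallback;
    }

    // Hypothetical helper: decodes the buffered bytes with the charset the
    // server declared, assuming UTF-8 when none is declared.
    static String decode(ByteArrayOutputStream bs, String contentType)
            throws UnsupportedEncodingException {
        Charset cs = charsetOf(contentType, StandardCharsets.UTF_8);
        return bs.toString(cs.name());
    }

    public static void main(String[] args) throws Exception {
        // Simulated response body: one byte that is only valid as ISO-8859-1.
        ByteArrayOutputStream bs = new ByteArrayOutputStream();
        bs.write(0xE9);
        String text = decode(bs, "text/html; charset=ISO-8859-1");
        System.out.println(text.equals("\u00e9"));
    }
}
```

Decoding the whole buffer once, after the read loop has finished, also avoids splitting a multi-byte character across two reads.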