
Android: Parsing an XML file that contains HTML entities, HTML characters, and URL addresses?


I'm trying to parse an XML file (an RSS feed), but the file contains HTML entity characters. They don't appear when I convert the file to a string, and I don't know how to encode them:

public String getXmlFromUrl(String url) {
    String xml = null;
    try {
        // Request the feed and read the response body as UTF-8 text
        DefaultHttpClient httpClient = new DefaultHttpClient();
        HttpPost httpPost = new HttpPost(url);
        HttpResponse httpResponse = httpClient.execute(httpPost);
        HttpEntity httpEntity = httpResponse.getEntity();

        xml = EntityUtils.toString(httpEntity, HTTP.UTF_8);

    } catch (UnsupportedEncodingException e) {
        e.printStackTrace();
    } catch (ClientProtocolException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
    // Null if the request or the decoding failed
    return xml;
}

For example, this is the text I want to get in my Java code:

<description>
     Amman Post: Shath'a Hasson pointed on the reason about &nbsp .... .... ...
</description>

But in the string I lose all the text after the &nbsp character.

And when I tried to parse a URL address:

http://www.ammanpost.net/index.php?page=article&id=25981

what I get in the string is this:

http://www.ammanpost.net/index.php?page=article

I lose everything after the '&' character.

Can you help me, please? Thank you.


Solution

  • I had the same problem in my app and managed to fix it with the Html class, like so:

    Html.fromHtml(string);

    For the URL problem, check out the URLDecoder class.
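
As a rough illustration of the URLDecoder suggestion, here is a minimal plain-Java sketch. The class name FeedTextDecoder, the helper decodeUrl, and the percent-encoded sample URL are assumptions made up for this example. It only applies if the link in the feed is percent-encoded (for instance `%26` standing in for `&`). URLDecoder does not decode XML or HTML entities such as `&amp;` or `&nbsp;`. On Android, those would go through `Html.fromHtml(string).toString()` as shown above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class FeedTextDecoder {

    // Hypothetical helper: turns percent-encoded sequences back into
    // characters, e.g. "%26" becomes "&". Note that URLDecoder also
    // converts "+" to a space.
    public static String decodeUrl(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Should not happen: UTF-8 support is mandatory on every JVM
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed input: the article link with its '&' percent-encoded
        String encoded = "http://www.ammanpost.net/index.php?page=article%26id=25981";
        System.out.println(decodeUrl(encoded));
    }
}
```

If the feed stores the link with a raw `&` instead, decoding will not bring back the lost part, and the text has to be read in full from the parser (or the ampersand escaped as `&amp;` at the source).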