Android Html.fromHtml() loses the HTML if it starts with <p> tag


I call a web service that returns some HTML enclosed in an XML envelope, something like:

<xml version="1.0" cache="false">
    <text color="white">
        <p> Some text <br /> <p>
    </text>
</xml>

I use XmlPullParser to parse this XML/HTML. To get the text in the element, I do the following:

case XmlPullParser.START_TAG:

    xmlNodeName = parser.getName();

    if (xmlNodeName.equalsIgnoreCase("text")) {
        String color = parser.getAttributeValue(null, "color");
        String text = parser.nextText();

        if (color.equalsIgnoreCase("white")) {

            detail.setDetail(Html.fromHtml(text).toString());

        }
    }
break;

This works well and gets the text or HTML in the element, even if it contains some HTML tags.

The issue arises when the element's data starts with a <p> tag, as in the example above. In this case the data is lost and text is empty.

How can I resolve this?

EDIT

Thanks to Nik & rajesh for pointing out that my service's response is not actually valid XML and that an element is not closed properly. But I have no control over the service, so I cannot edit what is returned. I wonder if there is something like HTML Agility that can parse any type of malformed HTML, or can at least get what is inside the tags, like inside <text> ... </text> in my case? That would also be good.

Or anything else that I can use to parse what I get from the service would be good, as long as it is reasonably easy to implement.

Excuse me for my bad English.


Solution

    Inspired by Martin's approach of first converting the received data to a string, I solved my problem with a kind of mixed approach.

    Convert the received InputStream's contents to a string and replace the erroneous tags with "" (or whatever you wish), as follows:

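    // Read the whole response into a string, dropping the unbalanced <p> tags as we go.
    // serviceReturnedStream is the InputStream returned by the web service call.
    InputStreamReader isr = new InputStreamReader(serviceReturnedStream);
    BufferedReader br = new BufferedReader(isr);
    StringBuilder xmlAsString = new StringBuilder(512);
    String line;
    try {
        while ((line = br.readLine()) != null) {
            // readLine() strips the line terminator, so add one back to keep lines separate.
            xmlAsString.append(line.replace("<p>", "").replace("</p>", "")).append('\n');
        }
    } catch (IOException e) {
        e.printStackTrace();
    }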
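
The literal replace("<p>", "") calls only catch the exact lowercase tags. If the service might also send variants such as <P> or <p class="...">, a regex can cover them. What follows is a minimal sketch in plain Java, not something the service or Android provides: the names ParagraphTagStripper and stripParagraphTags are made up for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
final class ParagraphTagStripper {

    // Matches opening and closing paragraph tags in any case, with or without
    // attributes (<p>, </p>, <P class="x">). The tag name must be followed by
    // whitespace or '>', so <pre> and <param> are left alone.
    private static final Pattern P_TAG =
            Pattern.compile("</?p(\\s[^>]*)?>", Pattern.CASE_INSENSITIVE);

    // Returns the input with every paragraph tag removed; all other markup is kept.
    static String stripParagraphTags(String xml) {
        return P_TAG.matcher(xml).replaceAll("");
    }

    public static void main(String[] args) {
        String raw = "<text color=\"white\">\n    <p> Some text <br /> <p>\n</text>";
        System.out.println(stripParagraphTags(raw));
    }
}
```

Inside the read loop, line.replace("<p>", "").replace("</p>", "") could then become ParagraphTagStripper.stripParagraphTags(line). Note that this still assumes a tag never spans two lines.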
    

    Now I have a string that contains correct XML data (for my case), so I just use the normal XmlPullParser to parse it instead of parsing it manually:

    XmlPullParserFactory factory = XmlPullParserFactory.newInstance();
    factory.setNamespaceAware(false);
    XmlPullParser parser = factory.newPullParser();
    parser.setInput(new StringReader(xmlAsString.toString()));
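
If stripping tags is still not enough to make the response parseable, one fallback is to skip XML parsing and pull the body of each <text> element out with a regex, then hand that body to Html.fromHtml(). This is only a sketch under assumptions: the element is assumed to look exactly like the sample in the question (a single double-quoted color attribute, no nested <text>), and TextElementExtractor and bodiesWithColor are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
final class TextElementExtractor {

    // Group 1 is the value of the color attribute, group 2 is the raw body.
    // DOTALL lets the body span several lines; the reluctant .*? stops at the
    // first closing </text>.
    private static final Pattern TEXT_ELEMENT = Pattern.compile(
            "<text\\s+color=\"([^\"]*)\"\\s*>(.*?)</text>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns the trimmed body of every <text> element whose color matches
    // (case-insensitively), in document order.
    static List<String> bodiesWithColor(String response, String color) {
        List<String> bodies = new ArrayList<>();
        Matcher m = TEXT_ELEMENT.matcher(response);
        while (m.find()) {
            if (m.group(1).equalsIgnoreCase(color)) {
                bodies.add(m.group(2).trim());
            }
        }
        return bodies;
    }

    public static void main(String[] args) {
        String response = "<xml version=\"1.0\" cache=\"false\">\n"
                + "    <text color=\"white\">\n"
                + "        <p> Some text <br /> <p>\n"
                + "    </text>\n"
                + "</xml>";
        for (String body : bodiesWithColor(response, "white")) {
            System.out.println(body);
        }
    }
}
```

A regex is fragile against changes in the service's output, so this only makes sense when the response format is small and stable, as it appears to be here.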
    

    Hope this helps someone!