android rss domparser

Android DOM parser issue


I have an RSS feed to parse that contains several tags. I am able to retrieve the value (child element) of every tag except the description node. The relevant part of the feed is below:

<fflag>0</fflag>
<tflag>0</tflag>
<ens1:org>C Opera Production</ens1:org>
<description>
<p>Opera to be announced</p>

<p>$15 adults/$12 seniors/$10 for college students<span style="white-space: pre;"> </span></p>
</description>

The code I am using for this is:

    StringBuffer descriptionAccumulator = new StringBuffer();

    // (excerpt; the preceding if-branch is not shown)
    else if (property.getNodeName().equals("description")) {
        try {
            String desc = property.getFirstChild().getNodeValue();
            if (property.getNodeName().equals("p")) {
                descriptionAccumulator.append(property.getFirstChild().getNodeValue());
            }
        } catch (Exception e) {
            Log.i(tag, "No desc");
        }
    } else if (property.getNodeName().equals("ens1:org")) {
        try {
            event.setOrganization(property.getFirstChild().getNodeValue());
            Log.i(tag, "org" + property.getFirstChild().getNodeValue());
        } catch (Exception e) {
        }
    } else if (property.getNodeName().equals("area") || property.getNodeName().equals("fflag")
            || property.getNodeName().equals("tflag") || property.getNodeName().equals("guid")) {
        try {
            //event.setOrganization(property.getFirstChild().getNodeValue());
            Log.i(tag, "org" + property.getFirstChild().getNodeValue());
        } catch (Exception e) {
        }
    } else if (property.getNodeName().equals("p") || property.getNodeName().equals("em")
            || property.getNodeName().equals("br") || property.getNodeName().startsWith("em")
            || property.getNodeName().startsWith("span") || property.getNodeName().startsWith("a")
            || property.getNodeName().startsWith("div") || property.getNodeName().equals("div")
            || property.getNodeName().startsWith("p")) {
        descriptionAccumulator.append(property.getFirstChild().getNodeValue());
        descriptionAccumulator.append(".");
        System.out.println("description added:" + descriptionAccumulator);
        Log.i("Description", descriptionAccumulator + property.getFirstChild().getNodeValue());
    }

I tried capturing the value of the <description> tag, but that didn't work, so I tried matching all the usual HTML formatting tags instead, still with no luck. Using a different parser is not an option. Could somebody please help me out with this? Thanks.


Solution

  • I believe something is wrong with the RSS XML. For instance, check what XML the Stack Overflow RSS feed returns. Pay particular attention to what the content of the <summary type="html"> node looks like: it has no child XML nodes inside, only plain XML-escaped text. So, if it is acceptable in your case, put the effort into generating proper RSS XML rather than into working around the consequences.
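
To illustrate the difference, here is a minimal sketch using only the standard org.w3c.dom API. The class name DescriptionText, its helper methods, and the two sample strings are made up for this example; they are not from the question's code. When <description> holds <p> child elements, as in the feed above, its first child is only the whitespace before the first <p>, which would explain why getFirstChild().getNodeValue() yields nothing useful. Walking all descendant text nodes recovers the text in that case, and with an escaped feed the same walk returns the markup as plain text.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only.
class DescriptionText {

    // Parses an XML string and returns its first <description> element.
    public static Node firstDescription(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getElementsByTagName("description").item(0);
    }

    // Appends the value of every text or CDATA node below 'node', in document order.
    public static void collectText(Node node, StringBuilder out) {
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            short type = child.getNodeType();
            if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
                out.append(child.getNodeValue());
            } else if (type == Node.ELEMENT_NODE) {
                collectText(child, out); // descend into <p>, <span>, ...
            }
        }
    }

    // Returns the trimmed text found anywhere inside the first <description>.
    public static String descriptionText(String xml) throws Exception {
        StringBuilder out = new StringBuilder();
        collectText(firstDescription(xml), out);
        return out.toString().trim();
    }

    public static void main(String[] args) throws Exception {
        // Shaped like the feed in the question: markup nested as real child elements.
        String nested = "<item><description>\n<p>Opera to be announced</p>\n</description></item>";
        // Shaped like the feed the answer recommends: markup escaped into plain text.
        String escaped = "<item><description>&lt;p&gt;Opera to be announced&lt;/p&gt;</description></item>";

        // The first child of the nested description is a whitespace-only text node.
        String first = firstDescription(nested).getFirstChild().getNodeValue();
        System.out.println("first child is blank: " + first.trim().isEmpty());

        System.out.println("nested:  " + descriptionText(nested));
        System.out.println("escaped: " + descriptionText(escaped));
    }
}
```

Where DOM Level 3 is available, Node.getTextContent() on the <description> element should give the same concatenated text without the hand-written recursion. Either way this only works around the feed; the escaped form remains the cleaner fix.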