XmlPullParser: skip one tag for an RSS reader in Android


I am developing a reader for a WordPress RSS feed. The problem is that it picks up the Gravatar image, which it should not.

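import android.util.Xml;

import org.xmlpull.v1.XmlPullParser;
import org.xmlpull.v1.XmlPullParserException;

import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

public class RssParser {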

    public List<RssItem> parse(InputStream inputStream) throws XmlPullParserException, IOException {
        try {
            XmlPullParser parser = Xml.newPullParser();
            parser.setFeature(XmlPullParser.FEATURE_PROCESS_NAMESPACES, false);
            parser.setInput(inputStream, "UTF-8");
            parser.nextTag();
            return readFeed(parser);
        } finally {
            inputStream.close();
        }
    }

    private List<RssItem> readFeed(XmlPullParser parser) throws IOException, XmlPullParserException {
        List<RssItem> items = new ArrayList<>();
        boolean insideItem = false;
        String imageUrl = null;
        parser.require(XmlPullParser.START_TAG, null, "rss");
        while (parser.next() != XmlPullParser.END_DOCUMENT) {
            if (parser.getEventType() != XmlPullParser.START_TAG) {
                continue;
            }
            String name = parser.getName();

            if (name.equals("item")) {
                insideItem = true;
            } else if (name.equals("media:content")) {
                if (insideItem)
                    imageUrl = readImage(parser);
            }

            if (imageUrl != null) {
                RssItem item = new RssItem(imageUrl);
                items.add(item);
                imageUrl = null;
            }
        }
        return items;
    }

    private String readImage(XmlPullParser parser) throws IOException, XmlPullParserException {
        parser.require(XmlPullParser.START_TAG, null, "media:content");
        return parser.getAttributeValue(null, "url");
    }
}

How can I skip the media:content element that contains the Gravatar image?

Here is part of my RSS feed as an example:

<media:content url="https://1.gravatar.com/avatar/7d261705b92edb50eaca05ed63ca453e?s=96&#38;d=identicon&#38;r=G" medium="image">
    <media:title type="html">renangueiros</media:title>
</media:content>

<media:content url="https://correntesproinfo.files.wordpress.com/2015/08/duvidas.jpg?w=300" medium="image">
    <media:title type="html">duvidas</media:title>
</media:content>

I want my code to skip the first media:content tag, which contains the Gravatar image, and return only the second, which contains the URL of the image I want to use.


Solution

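  • If the Gravatar image URLs are the ones that start with https://1.gravatar.com/avatar/, you can filter them out with the following code. It replaces both the name checks and the trailing if (imageUrl != null) block in readFeed, since the item is now added inside the media:content branch: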

      if (name.equals("item")) {
          insideItem = true;
      } else if (name.equals("media:content")) {
          if (insideItem) {
              imageUrl = readImage(parser);
              // Keep the image only if it is not a Gravatar avatar.
              if (imageUrl != null && !imageUrl.startsWith("https://1.gravatar.com/avatar/")) {
                  RssItem item = new RssItem(imageUrl);
                  items.add(item);
              }
          }
      }
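
The prefix check above only matches the exact host shown in the question. If the feed also serves avatars from other gravatar.com hosts (an assumption; the question only shows 1.gravatar.com), a host-based check may be more robust. The following is a minimal sketch of that idea. The class name GravatarFilter and the method isGravatarUrl are hypothetical names chosen for illustration, not part of the original code.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;

// Hypothetical helper, not part of the original RssParser.
class GravatarFilter {

    // Returns true when the URL's host is gravatar.com or any subdomain of it.
    // Null or unparseable URLs are treated as "not Gravatar".
    static boolean isGravatarUrl(String url) {
        if (url == null) {
            return false;
        }
        try {
            String host = new URI(url).getHost();
            if (host == null) {
                return false;
            }
            host = host.toLowerCase(Locale.ROOT);
            return host.equals("gravatar.com") || host.endsWith(".gravatar.com");
        } catch (URISyntaxException e) {
            return false;
        }
    }
}
```

With this helper, the condition in the snippet above could become imageUrl != null && !GravatarFilter.isGravatarUrl(imageUrl).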
    

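    Separately, note that your insideItem variable is never reset to false. You may want to check for the END_TAG of <item> and reset it there. Because your loop currently skips every event that is not a START_TAG, that check has to come before the continue.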
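
The reset described above can be sketched as follows. To keep the example runnable outside Android, it uses the JDK's StAX reader (javax.xml.stream) as a stand-in for XmlPullParser; the event loop has the same pull-style shape, but this is not the Android API. The class name ItemScopeSketch and the method imageUrlsInsideItems are hypothetical, and the sketch assumes the feed declares the media prefix on its root element.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical sketch: StAX stands in for Android's XmlPullParser here.
class ItemScopeSketch {

    // Collects media:content URLs that appear inside <item> elements only.
    static List<String> imageUrlsInsideItems(String xml) {
        List<String> urls = new ArrayList<>();
        boolean insideItem = false;
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String name = reader.getLocalName();
                    if (name.equals("item")) {
                        insideItem = true;
                    } else if (insideItem && name.equals("content")
                            && "media".equals(reader.getPrefix())) {
                        String url = reader.getAttributeValue(null, "url");
                        if (url != null) {
                            urls.add(url);
                        }
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && reader.getLocalName().equals("item")) {
                    // Leaving <item>: later media:content tags no longer count.
                    insideItem = false;
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed feed XML", e);
        }
        return urls;
    }
}
```

In the original readFeed, the equivalent change would be to test parser.getEventType() == XmlPullParser.END_TAG together with "item".equals(parser.getName()) before the START_TAG check, and set insideItem = false there.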