I am using Feedzirra to parse my RSS feeds and it works very well -- it is twice as fast as Feed Normalizer in my initial testing. More importantly, it has nice wrappers that check for updated entries inside a feed. When I was using its feed-update approach, I ran into some issues:
require 'feedzirra'
feed = Feedzirra::Feed.fetch_and_parse("http://feeds.feedburner.com/TechCrunch")
puts feed.etag #outputs the right tag
The above code prints the correct ETag (checked with Firebug). Now, when I want to check for updates, Feedzirra asks for the current ETag, last-modified date, etc. When I give it the right ETag, it says there are no updates - that's great. However, if I don't specify an ETag, it does not pick up the latest ETag after it fetches the feed. That's an issue because if an update happens and I have a stale ETag, I will never be able to get the current ETag short of calling fetch_and_parse - a waste of another fetch.
feed_to_update = Feedzirra::Parser::Atom.new
feed_to_update.feed_url = "http://feeds.feedburner.com/TechCrunch"
feed_to_update.etag = nil
feed_to_update.last_modified = nil
last_entry = Feedzirra::Parser::AtomEntry.new
last_entry.url = nil
feed_to_update.entries = [last_entry]
updated_feed = Feedzirra::Feed.update(feed_to_update)
puts updated_feed.updated?
puts updated_feed.etag
The above example is a modified version of the one in the author's documentation: http://gist.github.com/132671. I also tried passing a previous ETag value and it does not get updated either - I chose to use nil in the above code because the ETags change frequently for TechCrunch.
The output I get is:
true

#note: the line above is blank (basically printing nil)
Am I doing something wrong or using the functions incorrectly in any way? Or is this a bug in the library? Any other suggestions on how to look for updated feeds?
Btw, I also tried just using the 'last-modified-date' value and it always thinks there are new entries even if the date matches with the header response.
Thanks, -e
Update: In the output I had incorrectly typed 25 above the blank line. I have fixed that now, sorry.
I looked at the source code and found that etag was not being properly updated. So this seems to fix it:
After the line below (in add_feed_to_multi() of feed.rb)
feed.update_from_feed(updated_feed)
I added this line:
feed.etag = updated_feed.etag
I still have not found a way to resolve last_modified issues but for now etags are working.
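For anyone who wants to double-check what the server sends back independently of Feedzirra, here is a minimal sketch of a conditional GET using only Ruby's standard Net::HTTP. It is not Feedzirra's code; the helper names (build_conditional_get, summarize_response, check_feed) are made up for illustration, and it assumes the server honours If-None-Match / If-Modified-Since and that no redirect is involved (redirects are not followed here).

```ruby
require 'net/http'
require 'uri'

# Build a GET request carrying whichever validators we already have.
# (Hypothetical helper, not part of Feedzirra.)
def build_conditional_get(uri, etag: nil, last_modified: nil)
  request = Net::HTTP::Get.new(uri)
  request['If-None-Match'] = etag if etag
  request['If-Modified-Since'] = last_modified if last_modified
  request
end

# Summarize the response from its status and headers only; the body is not read.
# A 2xx status counts as "updated"; a 304 (or an error) does not.
# The old validators are kept when the response does not supply new ones.
def summarize_response(response, old_etag: nil, old_last_modified: nil)
  {
    updated: response.is_a?(Net::HTTPSuccess),
    etag: response['ETag'] || old_etag,
    last_modified: response['Last-Modified'] || old_last_modified
  }
end

# Hypothetical helper that ties the two together over the network.
def check_feed(url, etag: nil, last_modified: nil)
  uri = URI.parse(url)
  request = build_conditional_get(uri, etag: etag, last_modified: last_modified)
  response = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.request(request)
  end
  summarize_response(response, old_etag: etag, old_last_modified: last_modified)
end
```

Calling check_feed once with no validators and then again with the returned :etag and :last_modified should show whether the server itself answers 304 for matching values, which may help narrow down whether the last_modified problem is in the library or in the feed's headers.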