Most important items from a feed to save in a DB


I'm building a feed aggregator. I have a long list of blog addresses and I want to save all of their posts in a database. I'm using SimplePie to fetch the feeds and PHP to insert the posts into the database. SimplePie's API is very large, and I don't know which parts of a post I should save. So far I'm saving:

  • ID
  • Title
  • Date
  • Permalink
  • Author
  • Description
  • Content

What other fields should I save in the database? The API is long and I don't know all of it.


Solution

  • You could take a reverse-engineering approach.

    Pick a handful of the feeds you already have and view the source of each feed URL to see which tags it contains. After checking a few, you should be able to tell what most feeds provide and decide how to set up your database.

    For example, this is a feed URL:

    feeds.feedburner.com/webresourcesdepot?format=xml

    Open it in your browser, view the source, find a section that holds a post, and note which tags it uses.

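    You could also go through the RSS specification and look at every element a valid feed may contain. In RSS 2.0, an item can carry title, link, description, author, category, comments, enclosure, guid, pubDate and source, so beyond what you already store, categories and enclosures are the obvious candidates.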

    http://cyber.law.harvard.edu/rss/rss.html
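
Checking feeds by hand gets tedious with many blogs, so the tag survey can also be scripted. The following is only a rough sketch of that idea, not something SimplePie provides: `tally_item_tags()` is a hypothetical helper name, and it uses PHP's built-in DOM extension to count which child elements appear inside each RSS `<item>` or Atom `<entry>`. The sample XML is made up for illustration; in practice the string would come from your own feeds, for instance via `file_get_contents()` on a feed URL (assuming `allow_url_fopen` is enabled).

```php
<?php
// Hypothetical helper: count how often each child element occurs
// inside <item> (RSS) or <entry> (Atom) elements of a feed.
function tally_item_tags(string $xml): array
{
    $doc = new DOMDocument();
    // Suppress parser warnings; loadXML() returns false on malformed input.
    if (!@$doc->loadXML($xml)) {
        return [];
    }

    $counts = [];
    foreach (['item', 'entry'] as $container) {
        foreach ($doc->getElementsByTagName($container) as $node) {
            foreach ($node->childNodes as $child) {
                // Skip whitespace text nodes and comments.
                if ($child->nodeType === XML_ELEMENT_NODE) {
                    // nodeName keeps the prefix, e.g. "dc:creator".
                    $name = $child->nodeName;
                    $counts[$name] = ($counts[$name] ?? 0) + 1;
                }
            }
        }
    }

    arsort($counts); // most frequent tags first
    return $counts;
}

// Made-up sample feed, used only to demonstrate the helper.
$sample = <<<'XML'
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Example</title>
    <item>
      <title>First</title>
      <link>http://example.com/1</link>
      <guid>http://example.com/1</guid>
      <pubDate>Mon, 01 Jan 2024 00:00:00 GMT</pubDate>
      <dc:creator>Alice</dc:creator>
      <category>php</category>
      <category>feeds</category>
    </item>
    <item>
      <title>Second</title>
      <link>http://example.com/2</link>
      <guid>http://example.com/2</guid>
    </item>
  </channel>
</rss>
XML;

foreach (tally_item_tags($sample) as $tag => $count) {
    echo $tag, ': ', $count, PHP_EOL;
}
```

Tags that show up in nearly every item across your feeds are good candidates for dedicated columns. Rarer ones could be skipped or stored in a generic key/value table.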