c# asp.net rss full-text-search feed

How to read a full-text RSS feed


Some sites can produce a full-text RSS feed even when the original feed address doesn't include the full text, like this site.

How can I do that?


Solution

  • I don't know much about C#, but I can still give a general answer on how to solve your problem. RSS feeds (almost) always link to the article, hosted on the newspaper's or blog's website, where the whole article is available. So the "RSS filler" fetches the linked page, extracts the article text from it, and puts that text back into the feed, replacing the (short) intro that was there.

    To achieve this you need to:

    • parse and generate RSS/Atom feeds (I'm sure there are plenty of C# libs to do that)
    • find the actual article in the HTML page linked from the original RSS feed. The linked page contains a lot of things you don't want to put in the "full" RSS feed (such as the website header, nav bar, ads, comments, Facebook like button and so on). The easiest way to do this is to use Readability (a quick Google search turns up a library for it).

    If you combine both of these, you can achieve your goal.
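
    The two steps above can be sketched roughly as follows. This is a minimal, language-neutral illustration written in Python with only the standard library, not C#. The names `ParagraphExtractor`, `extract_article`, `fetch` and `fill_feed` are made up for this example, it assumes a plain RSS 2.0 feed (no Atom, no namespaces), and the "keep only the `<p>` text" heuristic is a crude stand-in for what a real Readability library does.

    ```python
    import urllib.request
    import xml.etree.ElementTree as ET
    from html.parser import HTMLParser


    class ParagraphExtractor(HTMLParser):
        """Crude stand-in for Readability: keeps the text of <p> tags only."""

        # Containers whose content is almost never part of the article.
        SKIP = {"script", "style", "nav", "header", "footer", "aside"}

        def __init__(self):
            super().__init__()
            self.paragraphs = []
            self._buf = None      # text pieces of the <p> currently open
            self._skip_depth = 0  # > 0 while inside one of the SKIP tags

        def handle_starttag(self, tag, attrs):
            if tag in self.SKIP:
                self._skip_depth += 1
            elif tag == "p" and self._skip_depth == 0:
                self._buf = []

        def handle_endtag(self, tag):
            if tag in self.SKIP and self._skip_depth > 0:
                self._skip_depth -= 1
            elif tag == "p" and self._buf is not None:
                # Collapse whitespace and drop empty paragraphs.
                text = " ".join("".join(self._buf).split())
                if text:
                    self.paragraphs.append(text)
                self._buf = None

        def handle_data(self, data):
            if self._buf is not None and self._skip_depth == 0:
                self._buf.append(data)


    def extract_article(html):
        """Return the page's paragraph text, separated by blank lines."""
        parser = ParagraphExtractor()
        parser.feed(html)
        parser.close()
        return "\n\n".join(parser.paragraphs)


    def fetch(url):
        """Download a page and decode it using the charset the server reports."""
        with urllib.request.urlopen(url, timeout=10) as resp:
            charset = resp.headers.get_content_charset() or "utf-8"
            return resp.read().decode(charset, errors="replace")


    def fill_feed(rss_xml, fetch_page=fetch):
        """Replace each item's <description> with text extracted from its <link>."""
        root = ET.fromstring(rss_xml)
        for item in root.iter("item"):
            link = item.findtext("link")
            if not link:
                continue
            full_text = extract_article(fetch_page(link.strip()))
            if not full_text:
                continue  # keep the original summary if nothing was extracted
            desc = item.find("description")
            if desc is None:
                desc = ET.SubElement(item, "description")
            desc.text = full_text
        return ET.tostring(root, encoding="unicode")
    ```

    Calling `fill_feed(fetch(feed_url))` with the address of a short-summary feed would return the same feed as an XML string with the descriptions filled in. The `fetch_page` parameter is there only so the downloader can be swapped out, for example for a caching one. In C# the same shape should carry over: a feed library for the first step and a Readability port for the second.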

    You can find one implementation of this kind of tool at http://fivefilters.org, and their source code (for older versions) is at http://code.fivefilters.org/full-text-rss/. It's in PHP, but it can give a rough idea of how to proceed.