
How to get the feed URL(s) from a website?


According to the official documentation, properly set up websites should indicate the URL of their RSS/Atom feed(s) when asked politely:

GET / HTTP/1.1
Host: example.com
Accept: application/rss+xml, application/xhtml+xml, text/html

When an HTTP server (or server-side script) gets this, it should redirect the HTTP client to the feed. It should do this with an HTTP 302 Found. Something like:

HTTP/1.1 302 Found
Location: http://example.com/feed

I'm trying to get this response, without luck:

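const request = require('request');

request(
  {
    method: 'GET',
    url: 'https://stackoverflow.com',
    followRedirect: false,
    // The Accept header must go in `headers`; `accept` is not a request option.
    headers: { Accept: 'application/rss+xml, application/xhtml+xml, text/html' }
  },
  function (error, response, body) {
    if (error) throw error;
    console.log('statusCode:', response.statusCode);
  }
);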

Yields

statusCode: 200

How do I formulate my request so that the website responds with the feed URL(s)?


Solution

  • It is not common practice for a website to return its RSS feed in response to a home page request whose Accept header asks for the application/rss+xml MIME type. The Mozilla documentation you've linked describes a suggestion I have never seen in many years of working with RSS as a developer.

    A more established and widely adopted method for a site to identify its RSS feed is a technique called RSS Autodiscovery. Open the site's home page and look for this tag in the HEAD section:

    <link rel="alternate" type="application/rss+xml" title="RSS"
        href="http://feeds.example.com/rss-feed">
    

    The type attribute can be any of the MIME types for RSS, Atom, or JSON Feed feeds: application/rss+xml, application/atom+xml, or application/feed+json.
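
    As a rough illustration of autodiscovery, the sketch below scans an HTML string for link tags with rel="alternate" and a feed MIME type, then resolves each href against the page URL. The names FEED_TYPES, parseAttributes, and findFeedLinks are made up for this example, and the regex-based matching is a simplification: it does not decode HTML entities and ignores edge cases such as a ">" inside an attribute value. A real HTML parser would be more robust.

```javascript
// Feed MIME types commonly advertised in autodiscovery <link> tags.
const FEED_TYPES = new Set([
  'application/rss+xml',
  'application/atom+xml',
  'application/feed+json',
]);

// Hypothetical helper: parse the attributes of one tag into an object
// keyed by lowercase attribute name.
function parseAttributes(tag) {
  const attrs = {};
  const attrPattern =
    /([a-zA-Z_:][-\w:.]*)\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s"'>]+))/g;
  let match;
  while ((match = attrPattern.exec(tag)) !== null) {
    // Exactly one of the three value groups is defined per match.
    const value = match[2] ?? match[3] ?? match[4] ?? '';
    attrs[match[1].toLowerCase()] = value;
  }
  return attrs;
}

// Hypothetical helper: return the feeds advertised by <link rel="alternate">
// tags in `html`. Relative href values are resolved against `pageUrl`.
function findFeedLinks(html, pageUrl) {
  const feeds = [];
  for (const [tag] of html.matchAll(/<link\b[^>]*>/gi)) {
    const attrs = parseAttributes(tag);
    const rels = (attrs.rel || '').toLowerCase().split(/\s+/);
    const type = (attrs.type || '').toLowerCase().trim();
    if (rels.includes('alternate') && FEED_TYPES.has(type) && attrs.href) {
      feeds.push({
        url: new URL(attrs.href, pageUrl).href,
        type,
        title: attrs.title || null,
      });
    }
  }
  return feeds;
}
```

    The HTML can come from any HTTP client. With the built-in fetch in Node 18 or later, for example, something like findFeedLinks(await res.text(), res.url) should work once res holds the response for the home page.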