
Is it possible to set 301 in an infinite loop based on the request


Please consider this language agnostic; I would really like to know what happens underneath. The code is in C#. The problem is that I am trying to fetch this URL from code: https://fr.wikipedia.org/wiki/Monast%C3%A8re_d%27Arkadi. Whatever I do, I get either "MaximumAutoRedirects exceeded" or "operation timed out".

Sample code in C#, although I see similar results in other languages:

using System;
using System.IO;
using System.Net;

var url = "https://fr.wikipedia.org/wiki/Monast%C3%A8re_d%27Arkadi";
var request = (HttpWebRequest)WebRequest.Create(url);
try
{
    request.CookieContainer = new CookieContainer();
    request.MaximumAutomaticRedirections = Int32.MaxValue; // effectively no redirect limit
    request.AllowAutoRedirect = true;                      // follow 3xx responses automatically
    request.Method = "GET";
    using (WebResponse response = request.GetResponse())
    using (var stream = response.GetResponseStream())
    using (var reader = new StreamReader(stream))
    {
        var res = reader.ReadToEnd();
    }
}
catch (Exception ex)
{
    // The errors described above surface here; print them instead of swallowing them.
    Console.WriteLine(ex.Message);
}

My questions:

  1. Why is this automatic redirect happening?
  2. Is it possible to tell a GET made from code apart from one made by a browser? (I am not sure how to phrase this. When I open the URL in Chrome, the page loads, but a GET from code ends in an infinite redirect. Are those two GETs different in any way?)

P.S.: This question prompted me to ask: "Why Wikipedia returns 301 response code with certain URL?"


Solution

  • Is it possible to set 301 in an infinite loop based on the request?

    Yes. An HTTP server can send you an infinite chain of redirects. A configuration problem on the server side is enough to create a redirection loop. A browser will usually detect the loop and stop redirecting, and any HTTP client should be able to do the same by giving up after a certain number of redirections (so request.MaximumAutomaticRedirections = Int32.MaxValue is probably far too large here).

    Why is it happening? Well, mistakes. You can write rules in the HTTP server configuration that redirect forever. The HTTP server is not a compiler, so this will not be detected on the server side.

    Or maybe you have many different servers to manage and they do not all receive the latest configuration at the same time. In the Wikipedia case it could happen, for example, while someone is fixing the redirections of a page. Say you have:

    • A -> B
    • B
    • C -> B

    And you fix it to:

    • A
    • B -> A
    • C -> A

    If any reverse proxy cache has stored the A -> B redirection in a temporary cache, or if the change is not immediately available in all database replicas (a wiki stores a lot of its rules in the application database), then you could get both A -> B and B -> A responses on the client side.
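
The scenario above can be sketched in a few lines. This is a minimal illustration, not how Wikipedia or any particular HTTP client works: Python is used only because the question is language agnostic, and `follow_redirects`, the three tables and the `max_hops` default are hypothetical names and values chosen for the example. It shows how a client can stop as soon as it sees the same page twice, instead of relying on a huge redirect limit.

```python
def follow_redirects(start, redirects, max_hops=20):
    """Follow a page -> target redirect table starting at `start`.

    Returns (final_page, path). Raises RuntimeError when a page is visited
    twice (a loop) or when more than `max_hops` redirects would be followed.
    """
    seen = []
    current = start
    while current in redirects:
        if current in seen:
            raise RuntimeError("redirect loop: " + " -> ".join(seen + [current]))
        if len(seen) >= max_hops:
            raise RuntimeError("too many redirects")
        seen.append(current)
        current = redirects[current]
    return current, seen + [current]


# Redirect tables from the example above (page -> target).
before_fix = {"A": "B", "C": "B"}
after_fix = {"B": "A", "C": "A"}
# Hypothetical mixed view: a stale cache still serves A -> B
# while the fixed rules B -> A and C -> A are already live.
stale_mix = {"A": "B", "B": "A", "C": "A"}

if __name__ == "__main__":
    print(follow_redirects("C", after_fix))
    try:
        follow_redirects("C", stale_mix)
    except RuntimeError as err:
        print(err)
```

Either table on its own resolves in one hop; only the mixed view, where the client sees the old and the new rule at the same time, loops.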

    Why this auto re-direct is happening?

    Because you wrote:

    request.AllowAutoRedirect = true;
    

    Note that I'm not a C# expert but it seems legit.

    Is it possible to detect a get from code / browser?

    Not sure what you are asking for.

    EDIT: OK, so your question is how to track the HTTP traffic the browser generates for a simple GET, and maybe also how to track it from your own code.

    In the browser you have native tools such as the network tab of the developer tools, where you can enable the "preserve log" option to track redirections. For a simple redirection I think you do not even need to preserve the log to see the redirect (301/302) responses followed by the subsequent requests.

    If you want to capture all HTTP traffic on your computer you can always use Wireshark, which is not so hard. Or you can route all the HTTP traffic of your application or browser through a proxy like Fiddler.
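
To track the redirections from code rather than from a tool, one option is to turn off automatic redirect handling and log every hop yourself. The sketch below is an assumption-laden illustration in Python, not the C# API from the question: `trace_redirects`, the `max_hops` default and the User-Agent string are made up for the example. It relies on the standard library's `http.client`, which never follows redirects on its own, so each 3xx response and its Location header stay visible.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = (301, 302, 303, 307, 308)


def trace_redirects(url, max_hops=10):
    """Request `url`, following redirects by hand for at most `max_hops` requests.

    Returns a list of (status, url) pairs, one per request actually made.
    """
    hops = []
    for _ in range(max_hops):
        parts = urlsplit(url)
        if parts.scheme == "https":
            conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
        else:
            conn = http.client.HTTPConnection(parts.netloc, timeout=10)
        try:
            path = parts.path or "/"
            if parts.query:
                path += "?" + parts.query
            # Placeholder User-Agent; http.client sends none by default.
            conn.request("GET", path, headers={"User-Agent": "redirect-tracer-example"})
            response = conn.getresponse()
            response.read()  # drain the body before closing
            status = response.status
            location = response.getheader("Location")
        finally:
            conn.close()
        hops.append((status, url))
        if status not in REDIRECT_CODES or not location:
            break
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, location)
    return hops
```

Calling `trace_redirects` on the URL from the question and printing the returned list shows the status code and target of every hop, which makes it easy to see whether the same URL keeps coming back.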