c#, multithreading, httpclient, socks, proxies

Dynamically change proxy in HttpClient without high CPU usage


I need to create a multithreaded application which makes requests (POST, GET, etc.). For this purpose I chose HttpClient.

By default it does not support SOCKS proxies, so I found SocksSharp (https://github.com/extremecodetv/SocksSharp), which can be used instead of the basic HttpClientHandler and lets me use SOCKS proxies.

But I have a problem: all my requests should be sent through different proxies, which I have scraped from the internet, and HttpClientHandler doesn't support changing proxies dynamically. If I don't have a valid proxy, I need to recreate the HttpClient. That's OK on its own, but if I have 200 threads, it takes a lot of CPU. So what should I do in this situation?

And a second problem: I found this article (https://aspnetmonsters.com/2016/08/2016-08-27-httpclientwrong/), which recommends using HttpClient as a single instance for better performance, but that seems impossible in a multithreaded program. Which approach is better in this case?

Thanks for any help.


Solution

  • HttpClientHandler doesn't support changing proxies dynamically.

    I'm not sure if that's technically true. Proxy is a read/write property so I believe you could change it (unless that results in a runtime error...I haven't actually tried it to be honest).

    UPDATE: I have tried it now and your assertion is technically true. In the sample below, the line that updates UseProxy fails with "System.InvalidOperationException: 'This instance has already started one or more requests. Properties can only be modified before sending the first request.'" Confirmed on both .NET Core and the full .NET Framework.

    // start with a handler that uses no proxy and send one request
    var hch = new HttpClientHandler { UseProxy = false };
    var hc = new HttpClient(hch);
    var resp = await hc.GetAsync(someUri);

    // fails: handler properties can only be modified before the first request
    hch.UseProxy = true;
    hch.Proxy = new WebProxy(someProxy);
    resp = await hc.GetAsync(someUri);
    

    But what is true is that you can't set a different proxy per request in a thread-safe way, and that's unfortunate.

    if I have 200 threads, it takes a lot of CPU

    Concurrent asynchronous HTTP calls should not consume extra threads or CPU. Fire them off using await Task.WhenAll or similar, and no thread is blocked while the requests are in flight; a thread is only used again when each response comes back. A minimal sketch of that pattern is below.
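
    For illustration only (the client, the URL list, and the surrounding async method are placeholders, not anything from the question), something like this fires off all requests concurrently:

    // using System.Linq; using System.Net.Http; using System.Threading.Tasks;
    var client = new HttpClient();
    var urls = new List<string> { "http://example.com/a", "http://example.com/b" };

    // start every request at once; no thread is blocked while they are in flight
    var tasks = urls.Select(u => client.GetStringAsync(u)).ToList();

    // asynchronously wait for all of them to complete
    string[] bodies = await Task.WhenAll(tasks);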

    And a second problem: I found this article...

    That's definitely something you need to look out for. However, even if you could set a different proxy per request, the underlying network stack would still need to open a socket for each proxy, so you wouldn't be gaining anything over an HttpClient instance per proxy in terms of the socket exhaustion problem.

    The best solution depends on just how many proxies you're talking about here. In the article, the author describes running into problems when the server hit around 4000-5000 open sockets, and no problems at around 400 or fewer. YMMV, but if the number of proxies is no more than a few hundred, you should be safe creating a new HttpClient instance per proxy. If it's more, I would look at throttling your concurrency and testing until you find a number your server resources can keep up with. In any case, make sure that if you need to make multiple calls to the same proxy, you're re-using HttpClient instances for them. A ConcurrentDictionary could be useful for lazily creating and reusing those instances, as in the sketch below.
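
    A rough sketch of that last idea, assuming one HttpClient per proxy address (the GetClientFor helper and the plain WebProxy are my own illustration; for SOCKS you'd swap in the SocksSharp handler instead of HttpClientHandler):

    // using System.Collections.Concurrent; using System.Net; using System.Net.Http;

    // one HttpClient per proxy address, created lazily and reused for
    // every subsequent request through that proxy
    static readonly ConcurrentDictionary<string, HttpClient> Clients =
        new ConcurrentDictionary<string, HttpClient>();

    static HttpClient GetClientFor(string proxyAddress)
    {
        return Clients.GetOrAdd(proxyAddress, addr =>
        {
            // plain HTTP proxy shown here; substitute the SocksSharp
            // handler for SOCKS proxies
            var handler = new HttpClientHandler
            {
                UseProxy = true,
                Proxy = new WebProxy(addr)
            };
            return new HttpClient(handler);
        });
    }

    Note that GetOrAdd may run the factory more than once under contention, but only one HttpClient ends up in the dictionary; if the occasional extra instance bothers you, store Lazy<HttpClient> values instead.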