
WebClient isn't downloading the right file from the supplied URL


I want to download a .torrent file for a Linux distro, but for some reason the file my app downloads is different from the one I download manually. The one my app downloads is 31KB and is an invalid .torrent file, while the right one (downloaded manually) is 41KB and is valid.

The URL of the file I want to download is http://torcache.net/torrent/C348CBCA08288AE07A97DD641C5D09EE25299FAC.torrent

Why is this happening, and how can I download the same file (the valid one, with 41KB)?

Thanks.


C# Code from the method that downloads the file above:

        string sLinkTorCache = @"http://torcache.net/torrent/C348CBCA08288AE07A97DD641C5D09EE25299FAC.torrent";
        using (System.Net.WebClient wc = new System.Net.WebClient())
        {
            var path = @"D:\Baixar automaticamente"; // HACK: take this from the settings in the final version
            var data = Helper.Retry(() => wc.DownloadData(sLinkTorCache), TimeSpan.FromSeconds(3), 5);
            string fileName = null;

            // Try to extract the filename from the Content-Disposition header
            var disposition = wc.ResponseHeaders["Content-Disposition"];
            if (!string.IsNullOrEmpty(disposition))
            {
                int index = disposition.IndexOf("filename=");
                if (index >= 0)
                    // Skip past "filename=" (9 characters) and strip any surrounding quotes
                    fileName = disposition.Substring(index + "filename=".Length).Trim('"');
            }

            var torrentPath = Path.Combine(path, fileName ?? "Arch Linux Distro");

            if (File.Exists(torrentPath))
            {
                File.Delete(torrentPath);
            }

            Helper.Retry(() => wc.DownloadFile(new Uri(sLinkTorCache), torrentPath), TimeSpan.FromSeconds(3), 5);
        }

Helper.Retry (retries the method if it throws an exception, such as an HTTP error):

    public static void Retry(Action action, TimeSpan retryInterval, int retryCount = 3)
    {
        Retry<object>(() =>
        {
            action();
            return null;
        }, retryInterval, retryCount);
    }

    public static T Retry<T>(Func<T> action, TimeSpan retryInterval, int retryCount = 3)
    {
        var exceptions = new List<Exception>();

        for (int retry = 0; retry < retryCount; retry++)
        {
            try
            {
                if (retry > 0)
                    System.Threading.Thread.Sleep(retryInterval); // TODO: add a using directive for System.Threading
                return action();
            }
            catch (Exception ex)
            {
                exceptions.Add(ex);
            }
        }

        throw new AggregateException(exceptions);
    }

Solution

  • I initially thought the site was responding with junk when it suspected the request came from a bot (that is, it was checking some of the headers). After having a look with Fiddler, it appears that the data returned is exactly the same for both a web browser and the code. That means the code is not decompressing the response, which would explain why the file your app saves is smaller (31KB) than the valid one (41KB). It's very common for web servers to compress the data (using something like gzip), and WebClient does not decompress it automatically.

    Using the answer from "Automatically decompress gzip response via WebClient.DownloadData", I managed to get it to work properly.

    Also note that you're downloading the file twice, once with DownloadData and again with DownloadFile. You don't need to do that.

    Working code:

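    // Taken from the question linked above (requires: using System; using System.Net;)
    class MyWebClient : WebClient
    {
        protected override WebRequest GetWebRequest(Uri address)
        {
            WebRequest request = base.GetWebRequest(address);
            // Only HTTP(S) requests support automatic decompression
            var httpRequest = request as HttpWebRequest;
            if (httpRequest != null)
                httpRequest.AutomaticDecompression = DecompressionMethods.Deflate | DecompressionMethods.GZip;
            return request;
        }
    }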
    

    And using it:

    string sLinkTorCache = @"http://torcache.net/torrent/C348CBCA08288AE07A97DD641C5D09EE25299FAC.torrent";
    using (var wc = new MyWebClient())
    {
        var path = @"C:\Junk";
        // Download once; MyWebClient has already decompressed the bytes
        var data = Helper.Retry(() => wc.DownloadData(sLinkTorCache), TimeSpan.FromSeconds(3), 5);
        // Left null so the fallback name below is used; an empty string would make
        // Path.Combine return just the folder path
        string fileName = null;

        var torrentPath = Path.Combine(path, fileName ?? "Arch Linux Distro.torrent");

        // File.WriteAllBytes overwrites an existing file, so no explicit delete is needed
        File.WriteAllBytes(torrentPath, data);
    }
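
    A quick way to confirm this diagnosis without Fiddler is to look at the first bytes of the downloaded data: a gzip stream starts with 0x1F 0x8B, while a valid .torrent file is a bencoded dictionary and starts with the letter d. The following is only a sketch of that check, with a manual fallback using GZipStream. The class and method names (GzipCheck, LooksGzipped, DecompressIfGzipped) are made up for illustration and are not part of WebClient.

    ```csharp
    using System;
    using System.IO;
    using System.IO.Compression;

    // Hypothetical helper, not part of the original answer.
    static class GzipCheck
    {
        // A gzip stream starts with the two magic bytes 0x1F 0x8B (RFC 1952).
        public static bool LooksGzipped(byte[] data)
        {
            return data != null && data.Length >= 2 && data[0] == 0x1F && data[1] == 0x8B;
        }

        // Returns the decompressed payload if the data looks gzipped;
        // otherwise returns the input unchanged.
        public static byte[] DecompressIfGzipped(byte[] data)
        {
            if (!LooksGzipped(data))
                return data;

            using (var input = new MemoryStream(data))
            using (var gzip = new GZipStream(input, CompressionMode.Decompress))
            using (var output = new MemoryStream())
            {
                gzip.CopyTo(output);
                return output.ToArray();
            }
        }
    }
    ```

    With the plain WebClient from the question, GzipCheck.DecompressIfGzipped(wc.DownloadData(sLinkTorCache)) should yield the same bytes as MyWebClient, assuming the server really sent gzip. A deflate-encoded response has no such magic bytes and would pass through this sketch unchanged, which is one reason to prefer the AutomaticDecompression approach.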