c# .net-core webclient

Program runs differently without debugger


I have a .NET 5 program that downloads a large number of files (1,000+) simultaneously using WebClient.DownloadFile. With the debugger attached it works as expected and downloads around 99% of the files, in both Debug and Release mode. Without the debugger, it fails to download more than 50% of the files.

All the threads finish before the program ends as they are foreground threads.

The code of the program is:

using System;
using System.IO;
using System.Net;
using System.Threading;

namespace Dumper
{
    internal sealed class Program
    {
        private static void Main(string[] args)
        {
            Directory.CreateDirectory(args[1]);

            foreach (string uri in File.ReadAllLines(args[0]))
            {
                string filePath = Path.Combine(args[1], uri.Split('/')[^1]);

                new Thread((param) =>
                {
                    (string path, string url) = ((string, string))param!;
                    using WebClient webClient = new();

                    try
                    {
                        webClient.DownloadFile(new Uri(url.Replace("%", "%25")), path);

                        Console.WriteLine($"{path} has been successfully downloaded.");
                    }
                    catch (UriFormatException)
                    {
                        throw;
                    }
                    catch (Exception e)
                    {
                        Console.WriteLine($"{path} failed to download: {e}");
                    }
                }).Start((filePath, uri));
            }
        }
    }
}

Solution

  • Your problem has little to do with the debugger; there are, however, several issues with the code in general. Here is a saner approach that waits for all the downloads to complete.

    Note: You could also use Task.WhenAll; however, I have chosen a TPL Dataflow ActionBlock in case you need to manage the degree of parallelism.

    Given

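    // Usings needed by the snippets in this answer (place them at the top of the file)
    using System;
    using System.IO;
    using System.Linq;
    using System.Net.Http;
    using System.Threading.Tasks;
    using System.Threading.Tasks.Dataflow;

    // One shared HttpClient for every request, rather than one client per download
    private static readonly HttpClient _client = new();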
    
    private static string _basePath;
    
    private static async Task ProcessAsync(string input)
    {
       try
       {
          var uri = new Uri(Uri.EscapeUriString(input));
    
          var filePath = Path.Combine(_basePath, input.Split('/')[^1]);
    
          using var result = await _client
             .GetAsync(uri)
             .ConfigureAwait(false);
    
          // fail fast
          result.EnsureSuccessStatusCode();
    
          await using var fileStream = new FileStream(filePath, FileMode.Create, FileAccess.Write, FileShare.None, 1024 * 1024, FileOptions.Asynchronous);
    
          await using var stream = await result.Content
             .ReadAsStreamAsync()
             .ConfigureAwait(false);
    
          await stream.CopyToAsync(fileStream)
             .ConfigureAwait(false);
    
          Console.WriteLine($"Downloaded : {uri}");
    
       }
       catch (Exception e)
       {
          Console.WriteLine(e);
       }
    }
    

    Usage

    private static async Task Main(string[] args)
    {
       var file = args.ElementAtOrDefault(0) ?? @"D:\test.txt";
       _basePath = args.ElementAtOrDefault(1) ?? @"D:\test";
    
       Directory.CreateDirectory(_basePath);
    
       var actionBlock = new ActionBlock<string>(ProcessAsync, new ExecutionDataflowBlockOptions
       {
          EnsureOrdered = false,
          // -1 (DataflowBlockOptions.Unbounded) means no limit; lower this if you think the site is throttling you
          MaxDegreeOfParallelism = -1
       });
    
       foreach (var uri in File.ReadLines(file))
          await actionBlock.SendAsync(uri);
    
       actionBlock.Complete();
       // wait to make sure everything is completed
       await actionBlock.Completion;
    
    }
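
For comparison, the Task.WhenAll alternative mentioned in the note could look roughly like the sketch below. It is not part of the original answer: the names WhenAllSketch, RunAllAsync and maxConcurrency are hypothetical, and using a SemaphoreSlim to cap concurrency is an assumption about how you might throttle without Dataflow. The work delegate is meant to be something like the ProcessAsync method shown above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

internal static class WhenAllSketch
{
    // Runs `work` once per input with at most `maxConcurrency` items in flight,
    // and completes only after every item has finished.
    // `maxConcurrency` must be at least 1; this sketch does not validate it.
    public static async Task RunAllAsync(
        IEnumerable<string> inputs,
        Func<string, Task> work,
        int maxConcurrency)
    {
        using var gate = new SemaphoreSlim(maxConcurrency);

        var tasks = inputs.Select(async input =>
        {
            // Wait for a free slot before starting this item.
            await gate.WaitAsync().ConfigureAwait(false);
            try
            {
                await work(input).ConfigureAwait(false);
            }
            finally
            {
                // Always free the slot, even if `work` throws.
                gate.Release();
            }
        }).ToList(); // materialize so every task is created before we await

        // Task.WhenAll completes only when all tasks have completed.
        await Task.WhenAll(tasks).ConfigureAwait(false);
    }
}
```

In Main this would replace the ActionBlock section with something like `await WhenAllSketch.RunAllAsync(File.ReadLines(file), ProcessAsync, maxConcurrency: 16);` where 16 is an arbitrary example value. Unlike the ActionBlock version, this creates one Task per line up front, which is usually fine for a few thousand URLs.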