
How to copy files async (or parallel)?


I'm trying to learn how to use async/await in C#. I have the following simple task: copy two files from one Windows PC to another (both on the same local network). In reality there can be about 1000 files, but for simplicity I have reduced it to two. My code:

using System.Diagnostics;
public static class Program
{
    public static async Task Main()
    {
        var destinationPath = @"path\to\destination\folder";
        List<string> filePaths = new()
        {
            @"\\remote-pc\c$\files\file1",
            @"\\remote-pc\c$\files\file2",
        };
        
        var watch = Stopwatch.StartNew();
        List<Task> tasks = new List<Task>();
        foreach (string path in filePaths)
        {
            Console.WriteLine($"Start copy {Path.GetFileName(path)}");
            using var sourceStream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read, bufferSize: 4096, useAsync: true);
            using var destinationStream = new FileStream(Path.Combine(destinationPath, Path.GetFileName(path)), FileMode.CreateNew, FileAccess.Write, FileShare.None, bufferSize: 4096, useAsync: true);
            var task = sourceStream.CopyToAsync(destinationStream).ContinueWith(_ => {
                Console.WriteLine($"End copy {Path.GetFileName(path)}");
            });
            tasks.Add(task);
        }
        
        await Task.WhenAll(tasks);
        watch.Stop();
        var elapsedMs = watch.ElapsedMilliseconds;
        Console.WriteLine(elapsedMs);
    }
}

If I understand correctly, async/await is meant for I/O-bound operations, which is exactly this case (or am I mistaken?). When executing the

sourceStream.CopyToAsync(destinationStream)

line, the calling thread is freed to perform the next operation (this appears to be true, since both "Start copy" messages are printed at almost the same time). However, the measured execution time (elapsedMs) is ~30 s for the two files, so I conclude that the files are not copied in parallel at all. When I copy each file separately, the execution times are ~20 s and ~6 s respectively. With "parallel copying" I therefore expected the total execution time to equal the time needed to copy the largest file.

Please help me understand where my reasoning goes wrong.


Solution

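  • It is not really true that the total time will equal the time needed to copy the largest file, because in most cases the bottleneck is the disk being read from or written to (or, for a remote copy, possibly the network link), and parallel copies share that throughput. Your numbers fit this: 20 s + 6 s = 26 s is close to the ~30 s you measured. You may still get better performance from parallel copying, but you might want to limit the maximum number of parallel copies.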

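    Either way, your code is not correct. Do not use ContinueWith; use await instead. And don't dispose the streams outside the task, otherwise the copy may fail: the using declarations in the loop body dispose the streams at the end of each iteration, while CopyToAsync may still be running. Instead, make an async lambda and dispose the streams inside it.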

    foreach (string path in filePaths)
    {
        Console.WriteLine($"Start copy {Path.GetFileName(path)}");
        tasks.Add(Task.Run(async () =>
        {
            using var sourceStream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read, bufferSize: 4096, useAsync: true);
            using var destinationStream = new FileStream(Path.Combine(destinationPath, Path.GetFileName(path)), FileMode.CreateNew, FileAccess.Write, FileShare.None, bufferSize: 4096, useAsync: true);
            await sourceStream.CopyToAsync(destinationStream);
            Console.WriteLine($"End copy {Path.GetFileName(path)}");
        }));
    }
    
    await Task.WhenAll(tasks);
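
To limit the number of parallel copies, as suggested above, one option is to gate each copy with a SemaphoreSlim. The following is only a minimal sketch under some assumptions: the class name ThrottledCopy, the method CopyAllAsync and the maxParallel parameter are hypothetical names chosen for illustration, and the buffer size is simply carried over from the question. It assumes .NET Core 3.0 or later, where FileStream supports await using.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottledCopy
{
    // Hypothetical helper: copies every file in sourcePaths into destinationDir,
    // allowing at most maxParallel copies to run at the same time.
    public static async Task CopyAllAsync(
        IEnumerable<string> sourcePaths, string destinationDir, int maxParallel)
    {
        using var gate = new SemaphoreSlim(maxParallel);

        List<Task> tasks = sourcePaths.Select(async path =>
        {
            await gate.WaitAsync();   // wait until one of the maxParallel slots is free
            try
            {
                string target = Path.Combine(destinationDir, Path.GetFileName(path));

                // The streams are created and disposed inside the task,
                // so they stay open until the copy has finished.
                await using var source = new FileStream(
                    path, FileMode.Open, FileAccess.Read, FileShare.Read,
                    bufferSize: 4096, useAsync: true);
                await using var destination = new FileStream(
                    target, FileMode.CreateNew, FileAccess.Write, FileShare.None,
                    bufferSize: 4096, useAsync: true);

                await source.CopyToAsync(destination);
            }
            finally
            {
                gate.Release();       // free the slot even if the copy failed
            }
        }).ToList();                  // ToList forces the query, which starts every task

        // Completes (or rethrows) only after all copies have finished,
        // so the semaphore is not disposed while still in use.
        await Task.WhenAll(tasks);
    }
}
```

With the variables from the question, it could be called as await ThrottledCopy.CopyAllAsync(filePaths, destinationPath, maxParallel: 4); where 4 is an arbitrary starting value. The best limit depends on the disks and the network, so it is worth measuring a few values. On .NET 6 and later, Parallel.ForEachAsync with ParallelOptions.MaxDegreeOfParallelism is an alternative way to get the same kind of throttling.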