Search code examples
c#multithreadingconcurrencyasp.net-core-webapiparallel.foreach

Parallel.ForEach not adding items as expected in ConcurrentBag in C#


In my Asp.Net Core WebApi Controller, I'm receiving a IFormFile[] files. I need to convert this to of List<DocumentData>. I first used foreach. It was working fine. But later decided to change to Parallel.ForEach as I'm receiving many(>5) files.

Here is my DocumentData Class:

public class DocumentData
{
    public byte[] BinaryData { get; set; }
    public string FileName { get; set; }
}

Here is my Parallel.ForEach Logic:

var documents = new ConcurrentBag<DocumentData>();
Parallel.ForEach(files, async (currentFile) =>
{
    if (currentFile.Length > 0)
    {
        using (var ms = new MemoryStream())
        {
            await currentFile.CopyToAsync(ms);
            documents.Add(new DocumentData
            {
                BinaryData = ms.ToArray(),
                FileName = currentFile.FileName
            });
        }
    }
});

For Example, even for two files as input, documents always gives one file as output. Am I missing something?

I initially had List<DocumentData>. I found that it's not thread safe and changed to ConcurrentBag<DocumentData>. But still I'm getting unexpected results. Please assist on where I'm wrong?


Solution

  • I guess it is because, Parallel.Foreach doesn't support async/await. It only takes Action as input and executes it for each item. And in case of async delegates it will execute them in a fire-and-forget manner. In that case passed lambda will be considered as async void function and async void can't be awaited.

    If there were overload which takes Func<Task> then it would work.

    I suggest you to create Tasks with the help of Select and use Task.WhenAll for executing them at the same time.

    For example:

    var tasks = files.Select(async currentFile =>
    {
        if (currentFile.Length > 0)
        {
            using (var ms = new MemoryStream())
            {
                await currentFile.CopyToAsync(ms);
                documents.Add(new DocumentData
                {
                    BinaryData = ms.ToArray(),
                    FileName = currentFile.FileName
                });
            }
        }
    });
    
    await Task.WhenAll(tasks);
    

    Additionally you can improve that code with just returning DocumentData instance from that method, and in such case there is no need to modify documents collection. Task.WhenAll has overload which takes IEnumerable<Task<TResult> as input and produces Task of TResult array. So, the result will be so:

    var tasks = files.Select(async currentFile =>
        {
            if (currentFile.Length > 0)
            {
                using (var ms = new MemoryStream())
                {
                    await currentFile.CopyToAsync(ms);
                    return new DocumentData
                    {
                        BinaryData = ms.ToArray(),
                        FileName = currentFile.FileName
                    };
                }
            }
    
            return null;
        });
    
    var documents =  (await Task.WhenAll(tasks)).Where(d => d != null).ToArray();