Tags: c#, .net, linq, ienumerable, parallel.foreachasync

Parallel.ForEachAsync does not change properties of IEnumerable items


I have the following class:

class Person
{
    public required string Name { get; set; }
    public required string Lastname { get; set; }

    public override string ToString() => $"{Name} - {Lastname}";
}

Then I create a collection of Person objects:

// dummy data
IEnumerable<Person> persons = Enumerable.Range(0, 1000).Select(index => new Person()
{
    Name = index.ToString(),
    Lastname = index.ToString() + "^"
});

After this, if I try to modify the items using Parallel.ForEachAsync, the actual Person objects inside the IEnumerable do not change:

await ChangeByParallel(persons); // does not change the items inside IEnumerable

static async Task ChangeByParallel(IEnumerable<Person> persons)
{
    await Parallel.ForEachAsync(persons, new ParallelOptions { MaxDegreeOfParallelism = Environment.ProcessorCount },
        async (person, token) =>
        {
            person.Name = "Ermalo";
            person.Lastname = "Magradze";
            await Task.Delay(100);
        });
}

I found that if I turn the IEnumerable into a List<Person>, the problem is fixed and the data inside the list is changed:

IEnumerable<Person> persons = Enumerable.Range(0, 1000).Select(index => new Person()
{
    Name = index.ToString(),
    Lastname = index.ToString() + "^"
}).ToList(); // ToList added, so the underlying collection is now a list.

await ChangeByParallel(persons); // working

static async Task ChangeByParallel(IEnumerable<Person> persons)
{
    await Parallel.ForEachAsync(persons, new ParallelOptions { MaxDegreeOfParallelism = Environment.ProcessorCount },
        async (person, token) =>
        {
            person.Name = "Ermalo";
            person.Lastname = "Magradze";
            await Task.Delay(100);
        });
}

Even though I found a solution, I still do not understand why ToList fixes the problem.

I know that IEnumerable is lazily evaluated, but when debugging this code I can see that the parallel body is executed. Also, I am not using a struct, so I would not expect the items of the IEnumerable to be copied.

So I wonder why Parallel.ForEachAsync does not change the items of the IEnumerable.


Solution

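  • The sequence produced by Enumerable.Range(...).Select(...) is not repeatable; it re-does everything (creating new objects, etc.) every time it is iterated. Parallel.ForEachAsync therefore modifies the Person objects created during its own iteration, and any later iteration of persons creates fresh, unmodified ones. We can test this: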

    var seq = Enumerable.Range(1241241, 1).Select(i => i.ToString());
    var a = seq.Single();
    var b = seq.Single();
    Console.WriteLine($"{a == b}, {ReferenceEquals(a, b)}");
    

    which outputs True, False because different instances of the string have been created for the two iterations (via Single()).

    More generally: any IEnumerable<T> may or may not be repeatable. Lists, arrays, etc. will generally be repeatable (as long as you don't change the data). More complex sequences, for example ones reading from a socket, some external data source, or an RNG: not so much.

    Adding .ToList() means the original sequence is only iterated once, and buffered into a list, which then is repeatable. Adding .ToList() to the first line of the test above gives us True, True.

    Note: .Select by itself is also not repeatable; if you have a list and create a sequence via list.Select(...), that sequence will also re-do all the projection logic every time it is iterated. The same goes for most LINQ operations.
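
The effect described above can be reproduced without any parallel code. The following is only a minimal sketch: StringBuilder stands in for the question's Person class (an assumption made so the snippet needs no extra type declaration), and the `created` counter is a hypothetical helper that records how often the projection lambda runs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

int created = 0; // counts how many times the projection lambda runs

// Lazy sequence: every enumeration re-runs the lambda and builds new objects.
IEnumerable<StringBuilder> lazy = Enumerable.Range(0, 3)
    .Select(i => { created++; return new StringBuilder(i.ToString()); });

// Mutates three objects that nothing else holds a reference to.
foreach (var sb in lazy) sb.Clear().Append("Ermalo");

// Enumerating again creates a fresh, unmodified object.
string lazyFirst = lazy.First().ToString();
int createdLazy = created;
Console.WriteLine($"lazy:     first = {lazyFirst}, objects created = {createdLazy}");

// Buffered: ToList() enumerates once and keeps those exact objects.
created = 0;
List<StringBuilder> buffered = lazy.ToList();
foreach (var sb in buffered) sb.Clear().Append("Ermalo");

string bufferedFirst = buffered.First().ToString();
int createdBuffered = created;
Console.WriteLine($"buffered: first = {bufferedFirst}, objects created = {createdBuffered}");
```

In the lazy case the modified objects are discarded as soon as the loop finishes, so the next enumeration shows the original value again. After ToList() the loop and the later read both see the same stored instances, which is why the change in the question becomes visible.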