Why do changes made in foreach to a Linq grouping select get ignored unless I add ToList()?


I have the following method.

public IEnumerable<Item> ChangeValueIEnumerable()
{
    var items = new List<Item>()
    {
        new Item("Item1", 1),
        new Item("Item2", 1),
        new Item("Item3", 2),
        new Item("Item4", 2),
        new Item("Item5", 3)
    };

    var groupedItems = items.GroupBy(i => i.Value)
        .Select(x => new Item(x.First().Name, x.Key));

    foreach (var item in groupedItems)
    {
        item.CalculatedValue = item.Name + item.Value;
    }

    return groupedItems;
}

In the returned groupedItems collection, every CalculatedValue is null. However, if I add a ToList() call to the Select statement after the GroupBy, the CalculatedValue properties are populated. For example:

var groupedItems = items.GroupBy(i => i.Value)
    .Select(x => new Item(x.First().Name, x.Key)).ToList();

So the question is: why does this happen? Adding ToList() already solves the problem for me, but I want to understand the reason.

Update: The definition of the Item class is the following:

 public class Item
{
    public string Name { get; set; }
    public int Value { get; set; }

    public string CalculatedValue { get; set; }

    public Item(string name, int value)
    {
        this.Name = name;
        this.Value = value;
    }
}

Solution

    var groupedItems = items.GroupBy(i => i.Value)
        .Select(x => new Item(x.First().Name, x.Key));
    

    Here, groupedItems doesn't actually hold any items. The IEnumerable<T> returned by Select represents a computation - to be precise, it represents the result of mapping items to a new set of items by applying the function x => new Item(x.First().Name, x.Key).

    Each time you iterate over groupedItems, the function will be applied and a new set of items will be created.

    var groupedItems = items.GroupBy(i => i.Value)
        .Select(x =>
        {
            Console.WriteLine("Creating new item");
            return new Item(x.First().Name, x.Key);
        });

    // Enumerate the query twice; the lambda runs again on each pass.
    foreach (var item in groupedItems) { }
    foreach (var item in groupedItems) { }
    

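    This code, for example, will print "Creating new item" twice for each group: once per group on every enumeration. With the sample data (three groups, two loops) that is six lines in total.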

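    In your code, you're setting the CalculatedValue of an ephemeral item. When the foreach loop is done, nothing holds on to those items. When the caller later enumerates the returned groupedItems, the Select function runs again and produces fresh items whose CalculatedValue is still null.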
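
The effect described above can be checked directly. The following is a minimal sketch, assuming the Item class from the question; the DeferredDemo class and its BuildQuery method are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Item
{
    public string Name { get; set; }
    public int Value { get; set; }
    public string CalculatedValue { get; set; }

    public Item(string name, int value)
    {
        Name = name;
        Value = value;
    }
}

// Hypothetical helper class, not part of the original question.
public static class DeferredDemo
{
    // Builds the same deferred query as in the question (no ToList).
    public static IEnumerable<Item> BuildQuery()
    {
        var items = new List<Item>
        {
            new Item("Item1", 1),
            new Item("Item2", 1),
            new Item("Item3", 2),
            new Item("Item4", 2),
            new Item("Item5", 3)
        };

        return items.GroupBy(i => i.Value)
            .Select(x => new Item(x.First().Name, x.Key));
    }

    public static void Main()
    {
        var query = BuildQuery();

        // Mutates items that exist only for the duration of this loop.
        foreach (var item in query)
        {
            item.CalculatedValue = item.Name + item.Value;
        }

        // A second enumeration creates brand-new items.
        Console.WriteLine(query.First().CalculatedValue == null);          // True
        Console.WriteLine(ReferenceEquals(query.First(), query.First())); // False
    }
}
```

Each call to First() starts a new enumeration, which is why even two consecutive calls return different object instances.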

    By calling ToList, you're turning the "computation" into an actual collection of items.
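
To see what ToList changes, here is a small sketch under the same assumptions (the Item class from the question; ToListDemo, BuildList, and the Created counter are hypothetical names added for illustration). The counter records how often the Select function runs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Item
{
    public string Name { get; set; }
    public int Value { get; set; }
    public string CalculatedValue { get; set; }

    public Item(string name, int value)
    {
        Name = name;
        Value = value;
    }
}

// Hypothetical helper class, not part of the original question.
public static class ToListDemo
{
    // Counts how many times the Select lambda has been invoked.
    public static int Created;

    public static List<Item> BuildList()
    {
        var items = new List<Item>
        {
            new Item("Item1", 1),
            new Item("Item2", 1),
            new Item("Item3", 2),
            new Item("Item4", 2),
            new Item("Item5", 3)
        };

        // ToList enumerates the query once and stores the resulting items.
        return items.GroupBy(i => i.Value)
            .Select(x =>
            {
                Created++;
                return new Item(x.First().Name, x.Key);
            })
            .ToList();
    }

    public static void Main()
    {
        var list = BuildList();

        // These mutations hit the stored items, so they persist.
        foreach (var item in list)
        {
            item.CalculatedValue = item.Name + item.Value;
        }

        // Iterating the list again does not re-run the Select lambda.
        foreach (var item in list) { }

        Console.WriteLine(Created);                 // 3
        Console.WriteLine(list[0].CalculatedValue); // Item11
    }
}
```

The list is a snapshot: it keeps the three items alive, so later loops see the same instances and any changes made to them.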

    Instead of calling ToList, you could alternatively create another computation that represents a new set of items with their CalculatedValue property set. This is the functional way.

    Func<Item, Item> withCalculatedValue =
        item => {
            item.CalculatedValue = item.Name + item.Value;
            return item;
        };
    
    return items.GroupBy(i => i.Value)
            .Select(x => new Item(x.First().Name, x.Key))
            .Select(withCalculatedValue);
    

    Or simply use an object initializer:

    return items.GroupBy(i => i.Value)
            .Select(x => new Item(x.First().Name, x.Key) { CalculatedValue = x.First().Name + x.Key });
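
Note that the query-only variants are still deferred: every enumeration creates new items, but each one arrives with CalculatedValue already set. A minimal sketch, again assuming the Item class from the question (InitializerDemo and BuildQuery are hypothetical names used only here):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Item
{
    public string Name { get; set; }
    public int Value { get; set; }
    public string CalculatedValue { get; set; }

    public Item(string name, int value)
    {
        Name = name;
        Value = value;
    }
}

// Hypothetical helper class, not part of the original answer.
public static class InitializerDemo
{
    // Deferred query that sets CalculatedValue while each item is created.
    public static IEnumerable<Item> BuildQuery()
    {
        var items = new List<Item>
        {
            new Item("Item1", 1),
            new Item("Item2", 1),
            new Item("Item3", 2),
            new Item("Item4", 2),
            new Item("Item5", 3)
        };

        return items.GroupBy(i => i.Value)
            .Select(x => new Item(x.First().Name, x.Key)
            {
                CalculatedValue = x.First().Name + x.Key
            });
    }

    public static void Main()
    {
        var query = BuildQuery();

        // The values are present on every enumeration.
        Console.WriteLine(string.Join(", ", query.Select(i => i.CalculatedValue)));
        // Item11, Item32, Item53

        // The instances themselves are still recreated each time.
        Console.WriteLine(ReferenceEquals(query.First(), query.First())); // False
    }
}
```

If callers need stable instances (for example to mutate them further), ToList is still the tool for that; the initializer only guarantees that the computed value is part of the computation.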
    

    If you want to do a little bit more research on the topic of objects that hold computations, google the term "Monad", but be prepared to be confused.