c# collections initialization

Why does a combination of object and collection initializers use the Add method?


The following combination of object and collection initializers compiles without error, but it is fundamentally wrong (see https://learn.microsoft.com/en-us/dotnet/csharp/programming-guide/classes-and-structs/object-and-collection-initializers#examples), because the initialization uses the Add method:

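using System.Collections.Generic;

public class Foo
{
    public List<string> Bar { get; set; }
}

public static class Program
{
    static void Main()
    {
        // Bar is never assigned, so this calls Add("one") on a null reference.
        var foo = new Foo
        {
            Bar =
            {
                "one",
                "two"
            }
        };
    }
}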

So you'll get a NullReferenceException. What is the reason for making such an unsafe decision when designing the syntax of the language? Why not initialize a new collection instead, for example?


Solution

  • First, this is not limited to the combination of object and collection initializers. What you are referring to is called a nested collection initializer, and the same rule (or issue, in your opinion) applies to nested object initializers. So if you have the following classes:

    public class Foo
    {
        public Bar Bar { get; set; }
    }
    
    public class Bar
    {
        public string Baz { get; set; }
    }
    

    and you use the following code

    var foo = new Foo
    {
        Bar = { Baz = "one" }
    };
    

    you'll get the same NRE at runtime, because no new Bar is created; instead, the initializer attempts to set the Baz property of the existing Foo.Bar, which is null.
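
    As a minimal sketch of the two ways to avoid that NRE: either make sure the nested object already exists, or assign a new one explicitly. SafeFoo and NestedObjectInitializerDemo are hypothetical names introduced only for this illustration; Foo and Bar mirror the classes above.

    ```csharp
    using System;

    public class Bar
    {
        public string Baz { get; set; }
    }

    // Same shape as the Foo above: Bar starts out as null.
    public class Foo
    {
        public Bar Bar { get; set; }
    }

    // Hypothetical variant whose Bar property is created up front.
    public class SafeFoo
    {
        public Bar Bar { get; set; } = new Bar();
    }

    public static class NestedObjectInitializerDemo
    {
        // Nested initializer: sets Baz on the Bar that SafeFoo already created.
        public static SafeFoo UseExistingBar() =>
            new SafeFoo { Bar = { Baz = "one" } };

        // Object creation expression: assigns a brand-new Bar to the property.
        public static Foo AssignNewBar() =>
            new Foo { Bar = new Bar { Baz = "one" } };

        public static void Main()
        {
            Console.WriteLine(UseExistingBar().Bar.Baz); // one
            Console.WriteLine(AssignNewBar().Bar.Baz);   // one
        }
    }
    ```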

    In general, the syntax for an object/collection initializer is

    target = source
    

    where the source can be an expression, an object initializer, or a collection initializer. Note that new List<Bar> { … } is not a collection initializer: it is an object creation expression (after all, everything is an object, including a collection) combined with a collection initializer. And here is the difference: the idea is not to let you omit the new, but to give you a choice between a creation expression plus an object/collection initializer and an initializer alone.
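
    To make that choice concrete, here is a small sketch in which the target list already holds an element, so the two forms give visibly different results. PrefilledFoo and InitializerChoiceDemo are hypothetical names used only for this example.

    ```csharp
    using System;
    using System.Collections.Generic;

    // Hypothetical variant of Foo whose list starts with one element.
    public class PrefilledFoo
    {
        public List<string> Bar { get; set; } = new List<string> { "zero" };
    }

    public static class InitializerChoiceDemo
    {
        // Initializer only: calls Add on the list that already exists.
        public static List<string> AppendToExisting() =>
            new PrefilledFoo { Bar = { "one", "two" } }.Bar;

        // Creation expression + initializer: replaces the list with a new one.
        public static List<string> ReplaceWithNew() =>
            new PrefilledFoo { Bar = new List<string> { "one", "two" } }.Bar;

        public static void Main()
        {
            Console.WriteLine(string.Join(", ", AppendToExisting())); // zero, one, two
            Console.WriteLine(string.Join(", ", ReplaceWithNew()));   // one, two
        }
    }
    ```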

    Unfortunately, the C# documentation does not explain this concept, but the C# specification does, in the Object Initializers section:

    A member initializer that specifies an object initializer after the equals sign is a nested object initializer, i.e. an initialization of an embedded object. Instead of assigning a new value to the field or property, the assignments in the nested object initializer are treated as assignments to members of the field or property. Nested object initializers cannot be applied to properties with a value type, or to read-only fields with a value type.

    and

    A member initializer that specifies a collection initializer after the equals sign is an initialization of an embedded collection. Instead of assigning a new collection to the target field, property or indexer, the elements given in the initializer are added to the collection referenced by the target.
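
    Read literally, the quoted rule means the initializer from the question behaves roughly like the hand-written expansion below. This is an approximation for illustration, not the compiler's literal output; DesugaringDemo, Expanded and Throws are hypothetical names.

    ```csharp
    using System;
    using System.Collections.Generic;

    public class Foo
    {
        public List<string> Bar { get; set; }
    }

    public static class DesugaringDemo
    {
        // Hand-written approximation of: new Foo { Bar = { "one", "two" } }
        public static Foo Expanded()
        {
            var temp = new Foo();
            temp.Bar.Add("one"); // Bar is null, so this line throws
            temp.Bar.Add("two");
            return temp;
        }

        // Reports whether creating a Foo this way raises a NullReferenceException.
        public static bool Throws(Func<Foo> create)
        {
            try { create(); return false; }
            catch (NullReferenceException) { return true; }
        }

        public static void Main()
        {
            Console.WriteLine(Throws(() => new Foo { Bar = { "one", "two" } })); // True
            Console.WriteLine(Throws(Expanded));                                  // True
        }
    }
    ```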


    So why is that? First, because it does exactly what you tell it to do. If you need new, then use new; otherwise it works as assignment (or Add for collections).

    There are other reasons, too. The target property might not be settable (as already mentioned in other answers). Its type might also be non-creatable (e.g. an interface or an abstract class). And even when it is a concrete class (unless it is a struct), how would the compiler decide that it should use new List<Bar> (or new Bar in my example) instead of new MyBarList, if we have

    class MyBarList : List<Bar> { }
    

    or new MyBar if we have

    class MyBar : Bar { }
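
    The sketch below combines those cases: a property with no setter, typed as an interface, and backed by a derived list class. The nested collection initializer still works, because it only calls Add on whatever instance is already there. ReadOnlyFoo, MyStringList and ReadOnlyTargetDemo are hypothetical names chosen for this example.

    ```csharp
    using System;
    using System.Collections.Generic;

    // Hypothetical derived collection, in the spirit of MyBarList above.
    public class MyStringList : List<string> { }

    public class ReadOnlyFoo
    {
        // No setter and an interface type: the compiler could neither assign
        // a new collection here nor know which concrete type to create.
        public IList<string> Bar { get; } = new MyStringList();
    }

    public static class ReadOnlyTargetDemo
    {
        // Adds to the existing MyStringList through the IList<string>.Add method.
        public static ReadOnlyFoo Create() =>
            new ReadOnlyFoo { Bar = { "one", "two" } };

        public static void Main()
        {
            var foo = Create();
            Console.WriteLine(foo.Bar.Count);          // 2
            Console.WriteLine(foo.Bar.GetType().Name); // MyStringList
        }
    }
    ```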
    

    As you can see, the compiler cannot make such assumptions, so IMO the language feature is designed to work in a quite clear and logical way. The only confusing part is probably the use of the = operator for something other than assignment, but I guess that was a design tradeoff: use the same = operator, and add new after it if needed.