
What should `ICollection<T>.Count` return when the collection contains more than 2^31 elements?


I am implementing a custom collection in .NET, and in doing so, am implementing the ICollection<T> interface by proxy. Because of this, I can't actually avoid implementing this interface. Part of the interface's contract is

public interface ICollection<T> : IEnumerable<T>, IEnumerable
{
    Int32 Count { get; }
    // ...
}

My collection will be able to hold more than Int32.MaxValue items (in fact, this is an expected use case). However, I'm not sure what the right thing to do is when the count exceeds the range of an Int32.

So what should occur when ICollection<T>.Count is called when the collection contains more than 2^31 elements? Return Int32.MaxValue? Throw an OverflowException? Something else?


Solution

  • There is no “right” answer because the official documentation is silent on this question. Any possible implementation will fail to behave properly (that is, as documented) in at least one scenario. Therefore, the best implementation of ICollection<T>.Count in your case will depend on how Count is used by your code, either directly or indirectly.

    Starting with the obvious, if Count is never used, then it doesn’t matter how you implement the property. You could hide the property with an explicit interface implementation that throws NotImplementedException (serving as a TODO), and revisit this later:

    Int32 ICollection<T>.Count => throw new NotImplementedException();
    

    But if Count is called, here are some advantages and disadvantages of the two options you listed plus a couple more:

    Option 1: If the count exceeds Int32.MaxValue, then throw an exception (such as OverflowException); otherwise, return the actual count.

    public Int32 Count => _count <= Int32.MaxValue
        ? (Int32)_count
        : throw new OverflowException();
    
    • Advantage: An exception signals that Count’s documented contract couldn’t be fulfilled and prevents subsequent code from silently operating on an incorrect value and possibly producing garbage output.
    • Disadvantage: Throwing an exception is relatively costly compared to returning a value, and if done frequently, may noticeably degrade performance.
    • Disadvantage: If you throw an exception, you’re likely going to want to catch it. However, while usages of Count in your own code are easy to spot, usages by framework and third-party APIs may be buried. For example, in recent versions of .NET, the LINQ .Any() extension method invokes Count to check whether the collection is empty. If you throw an exception only occasionally, it may be difficult to find all such indirect usages ahead of time and wrap them in try...catch blocks or refactor your code to avoid calling these APIs.
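
    As a side note on Option 1, a checked cast gives you the same throw-on-overflow behavior without writing the comparison yourself. The sketch below assumes the real count is held in a long; the class and method names are hypothetical and exist only for illustration. It also shows why a plain cast is not an option: in an unchecked context it silently wraps around.

    ```csharp
    using System;

    // Hypothetical helper names, used only to illustrate the conversions.
    public static class CountConversions
    {
        // Option 1 via a checked cast: throws OverflowException when
        // the value does not fit in an Int32.
        public static int ToCountOrThrow(long count) => checked((int)count);

        // What an unchecked cast does instead: it keeps the low 32 bits,
        // so a value just past Int32.MaxValue becomes negative.
        public static int ToCountUnchecked(long count) => unchecked((int)count);
    }
    ```

    With these definitions, ToCountOrThrow(5) returns 5, ToCountOrThrow(2147483648L) throws OverflowException, and ToCountUnchecked(2147483648L) returns Int32.MinValue.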

    Option 2: If the count exceeds Int32.MaxValue, then return a sentinel value (such as Int32.MaxValue); otherwise, return the actual count.

    public Int32 Count => _count < Int32.MaxValue
        ? (Int32)_count
        : Int32.MaxValue;
    

    This is pretty much the opposite of Option 1.

    • Advantage: Returning a value allows some existing code (like the LINQ .Any() extension method) to work no matter how many items are in the collection.
    • Advantage: Returning a value is relatively cheap compared to throwing an exception.
    • Disadvantage: If you return a sentinel value, you’re likely going to want to handle it as a special case. However, while usages of Count in your own code are easy to spot, usages by framework and third-party APIs may be buried. It may be difficult to predict whether such code will behave properly when given your sentinel value—now and in the future. (The current implementation of .Any() happens to work because it only checks whether the count is nonzero.)
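
    If you go with Option 2, the clamping and the special-case check might look roughly like the sketch below. It assumes the real count is a non-negative long; SaturatingCount, Clamp, and MayBeSaturated are hypothetical names, not framework APIs.

    ```csharp
    using System;

    // Hypothetical helpers illustrating Option 2's sentinel behavior.
    public static class SaturatingCount
    {
        // Clamp a non-negative 64-bit count to the Int32 range.
        // Math.Min(long, long) is chosen because int.MaxValue widens to long.
        public static int Clamp(long count) => (int)Math.Min(count, int.MaxValue);

        // A caller cannot tell "exactly Int32.MaxValue items" apart from
        // "more than that", so the sentinel has to be read as "possibly saturated".
        public static bool MayBeSaturated(int count) => count == int.MaxValue;
    }
    ```

    Code that you control can test MayBeSaturated(collection.Count) and then switch to whatever 64-bit count your collection exposes. Framework and third-party code will not perform that check, which is the disadvantage described above.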

    Option 3: Unconditionally throw an exception (such as NotSupportedException).

    Int32 ICollection<T>.Count => throw new NotSupportedException();
    
    • Advantage: Unlike Option 1 and Option 2, unconditionally throwing an exception makes it much more obvious (once you run your code) which framework and third-party APIs use Count, allowing you to refactor your code to avoid calling these APIs.
    • Disadvantage: You can’t call such framework and third-party APIs (such as the LINQ .Any() extension method) even when your collection contains no more than Int32.MaxValue items.
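
    To make Option 3 concrete, here is a minimal sketch of a class that hides the 32-bit Count behind an explicit interface implementation and exposes a 64-bit count instead. HugeBag<T> and LongCount are hypothetical names, and the List<T> backing store is only a stand-in so the example compiles; a real implementation would use storage that can exceed Int32.MaxValue items.

    ```csharp
    using System;
    using System.Collections;
    using System.Collections.Generic;

    // Hypothetical collection illustrating Option 3.
    public sealed class HugeBag<T> : ICollection<T>
    {
        // Stand-in storage; it cannot actually hold more than Int32.MaxValue items.
        private readonly List<T> _items = new List<T>();

        // The count callers are expected to use.
        public long LongCount => _items.Count;

        // Reachable only through an ICollection<T> reference; always throws.
        int ICollection<T>.Count =>
            throw new NotSupportedException("Use LongCount instead.");

        public bool IsReadOnly => false;
        public void Add(T item) => _items.Add(item);
        public void Clear() => _items.Clear();
        public bool Contains(T item) => _items.Contains(item);
        public void CopyTo(T[] array, int arrayIndex) => _items.CopyTo(array, arrayIndex);
        public bool Remove(T item) => _items.Remove(item);
        public IEnumerator<T> GetEnumerator() => _items.GetEnumerator();
        IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
    }
    ```

    Because the implementation is explicit, code holding a HugeBag<T> sees only LongCount as a property, while anything that reads Count through an ICollection<T> reference gets the NotSupportedException immediately.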

    Option 4: Don’t implement ICollection<T>.

    (You ruled out this option in your question. I’m providing it for completeness for anyone else who comes across this question in the future.)

    • Advantage: This option is the only one that adheres to the Liskov Substitution Principle. It avoids all the problems that arise from implementing a broken contract. For example, the LINQ .Any() extension method is guaranteed to work no matter how many items are in the collection.
    • Disadvantage: This option disables some performance optimizations that could be available when the collection contains no more than Int32.MaxValue items. For example, .Any() is forced to always allocate an enumerator.
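
    For readers who can take Option 4, a rough sketch follows. BigRange and its LongCount property are hypothetical names; the class generates its values on the fly, so it can represent more than Int32.MaxValue items without storing them.

    ```csharp
    using System;
    using System.Collections;
    using System.Collections.Generic;
    using System.Linq;

    // Hypothetical collection that implements IEnumerable<T> but not ICollection<T>.
    public sealed class BigRange : IEnumerable<long>
    {
        private readonly long _count;

        public BigRange(long count)
        {
            if (count < 0) throw new ArgumentOutOfRangeException(nameof(count));
            _count = count;
        }

        // A 64-bit count, free of the Int32 limit of ICollection<T>.Count.
        public long LongCount => _count;

        // Yields 0, 1, ..., count - 1 lazily.
        public IEnumerator<long> GetEnumerator()
        {
            for (long i = 0; i < _count; i++)
                yield return i;
        }

        IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
    }

    public static class BigRangeDemo
    {
        public static void Run()
        {
            var range = new BigRange(3_000_000_000L);
            Console.WriteLine(range.LongCount); // 3000000000
            Console.WriteLine(range.Any());     // True, found by enumerating one item
        }
    }
    ```

    Since BigRange is not an ICollection<T>, .Any() has no Count to consult and simply asks the enumerator for one element. Avoid the LINQ Count() extension method on a sequence like this one: it counts by enumerating and throws OverflowException once the total passes Int32.MaxValue.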