Search code examples
c#.netgenericsimmutability

Best Practice List/Array/ReadOnlyCollection creation (and usage)


My code is littered with collections - not an unusual thing, I suppose. However, usage of the various collection types isn't obvious nor trivial. Generally, I'd like to use the type that's exposes the "best" API, and has the least syntactic noise. (See Best practice when returning an array of values, Using list arrays - Best practices for comparable questions). There are guidelines suggesting what types to use in an API, but these are impractical in normal (non-API) code.

For instance:

new ReadOnlyCollection<Tuple<string,int>>(
    new List<Tuple<string,int>> {
        Tuple.Create("abc",3),
        Tuple.Create("def",37)
    }
)

List's are a very common datastructure, but creating them in this fashion involves quite a bit of syntactic noise - and it can easily get even worse (e.g. dictionaries). As it turns out, many lists are never changed, or at least never extended. Of course ReadOnlyCollection introduces yet more syntactic noise, and it doesn't even convey quite what I mean; after all ReadOnlyCollection may wrap a mutating collection. Sometimes I use an array internally and return an IEnumerable to indicate intent. But most of these approaches have a very low signal-to-noise ratio; and that's absolutely critical to understanding code.

For the 99% of all code that is not a public API, it's not necessary to follow Framework Guidelines: however, I still want a comprehensible code and a type that communicates intent.

So, what's the best-practice way to deal with the bog-standard task of making small collections to pass around values? Should array be preferred over List where possible? Something else entirely? What's the best way - clean, readable, reasonably efficient - of passing around such small collections? In particular, code should be obvious to future maintainers that have not read this question and don't want to read swathes of API docs yet still understand what the intent is. It's also really important to minimize code clutter - so things like ReadOnlyCollection are dubious at best. Nothing wrong with wordy types for major API's with small surfaces, but not as a general practice inside a large codebase.

What's the best way to pass around lists of values without lots of code clutter (such as explicit type parameters) but that still communicates intent clearly?

Edit: clarified that this is about making short, clear code, not about public API's.


Solution

  • Update:

    So, this isn't strictly speaking new; but this question convinced me to go ahead and announce an open source project I've had in the works for a while (still a work in progress, but there's some useful stuff in there), which includes an IArray<T> interface (and implementations, naturally) that I think captures exactly what you want here: an indexed, read-only, even covariant (bonus!) interface.

    Some benefits:

    • It's not a concrete type like ReadOnlyCollection<T>, so it doesn't tie you down to a specific implementation.
    • It's not just a wrapper (like ReadOnlyCollection<T>), so it "really is" read-only.
    • It clears the way for some really nice extension methods. So far the Tao.NET library only has two (I know, weak), but more are on the way. And you can easily make your own, too—just derive from ArrayBase<T> (also in the library) and override the this[int] and Count properties and you're done.

    If this sounds promising to you, feel free to check it out and let me know what you think.


    It's not 100% clear to me where you're worried about this "syntactic noise": in your code or in calling code?

    If you're tolerant of some "noise" in your own encapsulated code then I would suggest wrapping a T[] array and exposing an IList<T> which happens to be a ReadOnlyCollection<T>:

    class ThingsCollection
    {
        ReadOnlyCollection<Thing> _things;
    
        public ThingsCollection()
        {
            Thing[] things = CreateThings();
            _things = Array.AsReadOnly(things);
        }
    
        public IList<Thing> Things
        {
            get { return _things; }
        }
    
        protected virtual Thing[] CreateThings()
        {
            // Whatever you want, obviously.
            return new Thing[0];
        }
    }
    

    Yes there is some noise on your end, but it's not bad. And the interface you expose is quite clean.

    Another option is to make your own interface, something like IArray<T>, which wraps a T[] and provides a get-only indexer. Then expose that. This is basically as clean as exposing a T[] but without falsely conveying the idea that items can be set by index.