Search code examples
c#listdata-structuressortedlist

C# Data Structures Question (Which collection to use?)


I need to implement a large collection of Widget objects, each of which contain a unique file path string ("FilePath"). I need to be able to do the following:

  1. Retrieve a Widget object quickly given the file path
  2. Change the file path of a Widget without creating a new object (multiple other objects may contain references to a single Widget, and tracking them down would impact performance)
  3. Given a Widget reference, determine it's file path

I first thought of using a generic SortedList using the file path as a key, but duplicating the path for many thousands of objects could quickly eat up memory. I considered removing the path from the object and only storing it in the list of keys, but that would make requirement 3 above hard to accomplish.

What I'm leaning towards now is rolling my own class derived from List<> that adds the Widget objects in a sorted order, and retrieves them with a binary search. Requirement 2 can be accomplished simply by removing an object from the list, changing it's file path, and adding it back to the list.

But I'm relatively new to C# and I wanted to check with the great minds here and see if I'm missing another obvious solution.

Thanks!


Solution

  • "Duplicating" the strings will not use twice the memory: Since strings are immutable objects in c#, you will just store another reference (i.e. pointer, 4 or 8 byts) per entry in the dictionary:

    Dictionary<string, Widget> dict = new Dictionary<string, Widget>();
    Widget myWidget = GetSomeWidget();
    dict.Add(myWidget.Name, myWidget);
    

    You will always be reusing the string object from the widget's property, so just go on with the dict and store the path as property inside the widget.

    If you don't need to enumerate the widgets in sorted order, don't use the SortedList, it will be slower than the Dictionary (O(n log n) insertion/deletion/retrieval vs. O(n) average time)

    Changing the widget's path will need you to remove it from the dictionary and add it with the changed path, but this is an average constant time operation, so it should be pretty fast.

    And just to mention it: Even if you would have to spend one MB of additional memory for getting more performance or using a better-suited (and well-tested) data structure, I don't think that would be a great problem considering the amount of memory other applicatins are using (wasting?) these days ...