I've a c# Dictionary<DateTime,SomeObject>
instance.
I've the following code:
private Dictionary<DateTime, SomeObject> _containedObjects = ...;//Let's imagine you have ~4000 items in it
public IEnumerable<SomeObject> GetItemsList(HashSet<DateTime> requiredTimestamps){
//How to return the list of SomeObject contained in _containedObjects
//Knowing that rarely(~<5% of the call), one or several DateTime of "requiredTimestamps" may not be in _containedObjects
}
I'm looking how to return an IEnumerable<SomeObject>
containing all element that were referenced by one of the provided keys. The only issue is that this method will be called very often, and we might not always have every given key in parameter.
So is there something more efficient than this:
private Dictionary<DateTime, SomeObject> _containedObjects = ...;//Let's imagine you have ~4000 items in it
public IEnumerable<SomeObject> GetItemsList(HashSet<DateTime> requiredTimestamps){
List<SomeObject> toReturn = new List<SomeObject>();
foreach(DateTime dateTime in requiredTimestamps){
SomeObject found;
if(_containedObjects.TryGetValue(dateTime, out found)){
toReturn.Add(found);
}
}
return toReturn;
}
Method 1:
To make this significantly faster - this is not by changing the algorithm but by making a local copy of _containedObjects
in your method and referencing the local copy for the lookup.
Example:
public static IEnumerable<int> GetItemsList3(HashSet<DateTime> requiredTimestamps)
{
var tmp = _containedObjects;
List<int> toReturn = new List<int>();
foreach (DateTime dateTime in requiredTimestamps)
{
int found;
if (tmp.TryGetValue(dateTime, out found))
{
toReturn.Add(found);
}
}
return toReturn;
}
Test data and times (on set of 5000 items with 125 keys found):
Your original method (milliseconds): 2,06032186895335
Method 1 (milliseconds): 0,53549626223609
Method 2: One way to make this marginally quicker is to iterate through the smaller set and do the lookup on the bigger set. Depending on the size difference you will gain some speed.
You are using a Dictionary and HashSet, so your lookup on either of these will be O(1).
Example: If _containedObjects
has less items than requiredTimestamps
we loop through _containedObjects
(otherwise use your method for the converse)
public static IEnumerable<int> GetItemsList2(HashSet<DateTime> requiredTimestamps)
{
List<int> toReturn = new List<int>();
foreach (var dateTime in _containedObjects)
{
int found;
if (requiredTimestamps.Contains(dateTime.Key))
{
toReturn.Add(dateTime.Value);
}
}
return toReturn;
}
Test data and times (on set of 5000 for _containedObjects
and set of 10000 items for requiredTimestamps
with 125 keys found):
Your original method (milliseconds): 3,88056291367086
Method 2 (milliseconds): 3,31025939438943