I need to search through a collection of "Item" objects that contain a time property. I have a fast solution in place but its very messy (will post if nothing better comes up). Below are my attempts at performing the search myself using LINQ and whatnot.
In my particular case, I know the items are ordered low to high based on time. When I iterate through them, it's 9/12, 9/13, 9/14. I'd like to find a solution that's fast even if this isn't ordered, but not important right now.
//ICollection c = GetCollection(); //25,000+ items
DateTime TIME = DateTime.Now.AddDays(-1);
EventLog scan = new EventLog("Application", "Server", "N/A");
EventLogCollection c = scan.Entries;
Console.WriteLine(logs.Count); // All entries already in list here
// 64 sec - SLOW
List<Item> l1 = new List<Item>();
foreach (Item i in c) {
if (i.time > TIME) {
l1.Add(i); }
}
// 93 sec - SLOWER
var l2 = c.Cast<Item>().AsParallel().Select(n => n.time > TIME);
var i = l2.Count();
// 98 sec - EVEN SLOWER!
List<Item> l3 = new List<Item>();
Parallel.ForEach(c.Cast<Item>(), n => {
if (n.time > TIME) {
l3.add(n);
}
});
My current solution is do a BinarySearch for the start and end times and loop through the ICollection based on those indexes. It's very fast (1-2 sec) but very messy. I don't need a different solution but thought I'd toss this out to you performance experts.
Is there a faster, elegant to way to search the ICollection? BTW I have no control over the collection I'm given and can't change it's structure. .NET 4.0
And no, I can't use System.Diagnostics.Eventing.Reader because I'm stuck with Windows XP
EDIT
On the assertion that dasblinkenlight's answer is generally right I wrote a quick generic helper function.
public static int BinaryFirstIndexOf<T>(
Func<int, T> indexer,
Func<T, boo> predicate,
int count)
{
var low = 0;
var high = count - 1;
while (low < (high - 1))
{
var mid = low + ((high - low) / 2);
if (predicate(indexer(mid))
{
high = mid;
}
else
{
low = mid;
}
}
if (predicate(indexer(low)))
{
return low;
}
if (low != high && predicate(indexer(high)))
{
return high;
}
return -1;
}
Which I would use like this,
var time = DateTime.Now.AddDays(-1);
var c = scan.Entries;
var first = BinaryFirstIndexOf(
i => c[i],
e => e.TimeGenerated > time,
c.Count);
if (first >= 0)
{
var result = new List<Item>(c.Count - first);
Enumerable.Range(first, c.Count - first).AsParallel()
.ForAll(i =>
{
var j = i - first;
result[j] = (Item)c[i];
});
}
Don't you want to do something like this
var time = DateTime.Now.AddDays(-1);
var c = scan.Entries;
var cutOff = 0;
for (var i = c.Count - 1; i > - 1; i--)
{
if (c[i].TimeGenerated < time)
{
cutOff = i;
break;
}
}
var result = new List<Item>(c.Count - cutOff - 1);
var j = 0;
for (var i = cutOff + 1; i < c.Count; i ++)
{
result[j] = (Item)c[i];
j++;
}
I'm assuming the data you want is on the end and takes up less that half the collection.
Or perhaps using linq to do the casting in parallel,
var time = DateTime.Now.AddDays(-1);
var c = scan.Entries;
var cutOff = 0;
for (var i = c.Count - 1; i > - 1; i--)
{
if (c[i].TimeGenerated < time)
{
cutOff = i;
break;
}
}
var result = new List<Item>(c.Count - cutOff - 1);
Enumerable.Range(cutOff + 1, c.Count - cutOff - 1).AsParallel()
.ForAll(i =>
{
var j = i - cutOff - 1;
result[j] = (Item)c[i];
});