I want to check whether any item of one list is contained in the items of another list.
My code:
```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

List<string> words = new() { "the", "play", "in", "on" };
string txt = "Barcelona is playing against Manchester tonight.";
List<string> txtList = txt.Split(' ', StringSplitOptions.RemoveEmptyEntries).ToList();
bool exist = false;
Stopwatch sw;
Stopwatch sw2;

// Time 1000 iterations of List<T>.Exists
sw = Stopwatch.StartNew();
for (int i = 0; i < 1000; i++)
{
    exist = words.Exists(w => txtList.Exists(t => t.Contains(w)));
}
sw.Stop();
Console.WriteLine("ExistResult: " + exist + Environment.NewLine + "ExistTime: " + sw.ElapsedTicks);

// Time 1000 iterations of Enumerable.Any
sw2 = Stopwatch.StartNew();
for (int i = 0; i < 1000; i++)
{
    exist = words.Any(w => txtList.Any(t => t.Contains(w)));
}
sw2.Stop();
Console.WriteLine("AnyResult: " + exist + Environment.NewLine + "AnyTime: " + sw2.ElapsedTicks);
```
Result:
```
ExistResult: True
ExistTime: 2574053
AnyResult: True
AnyTime: 1265826
```
In this measurement, `Any` is twice as fast.
So why does Visual Studio recommend that I use the `Exists` method?
Benchmarking is hard, and you should never try to DIY your way to one. I couldn't even begin to tell you which of the million little rules you messed up, but a proper benchmark shows pretty much what you'd expect:
| Method | N | Mean | Error | StdDev |
|------- |------ |----------:|---------:|---------:|
| Exists | 1000 | 27.17 ms | 0.257 ms | 0.241 ms |
| Any | 1000 | 53.19 ms | 0.614 ms | 0.575 ms |
| Exists | 5000 | 132.33 ms | 0.962 ms | 0.852 ms |
| Any | 5000 | 259.50 ms | 1.104 ms | 0.979 ms |
| Exists | 10000 | 245.17 ms | 2.417 ms | 2.261 ms |
| Any | 10000 | 521.30 ms | 4.468 ms | 4.179 ms |
In other words, `Exists` is about twice as fast as `Any`.
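For anyone who wants to run a comparison like this themselves, here is a minimal sketch using BenchmarkDotNet, which handles warm-up, JIT compilation and statistical reporting for you. It is not the code that produced the table above. The class name `ExistsVsAny` is hypothetical, and treating `N` as the number of loop iterations is an assumption carried over from the question's loop, so absolute timings will differ.

```csharp
// Requires the BenchmarkDotNet NuGet package; build and run in Release configuration.
using System;
using System.Collections.Generic;
using System.Linq;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

public class ExistsVsAny // hypothetical class name
{
    private List<string> words = new();
    private List<string> txtList = new();

    // Assumption: N is the number of loop iterations, as in the question's loop.
    [Params(1000, 5000, 10000)]
    public int N { get; set; }

    [GlobalSetup]
    public void Setup()
    {
        // Same data as in the question.
        words = new List<string> { "the", "play", "in", "on" };
        txtList = "Barcelona is playing against Manchester tonight."
            .Split(' ', StringSplitOptions.RemoveEmptyEntries)
            .ToList();
    }

    [Benchmark]
    public bool Exists()
    {
        bool exist = false;
        for (int i = 0; i < N; i++)
        {
            exist = words.Exists(w => txtList.Exists(t => t.Contains(w)));
        }
        return exist; // returning the result keeps the work from being optimized away
    }

    [Benchmark]
    public bool Any()
    {
        bool exist = false;
        for (int i = 0; i < N; i++)
        {
            exist = words.Any(w => txtList.Any(t => t.Contains(w)));
        }
        return exist;
    }
}

public static class Program
{
    public static void Main() => BenchmarkRunner.Run<ExistsVsAny>();
}
```

Each method runs in its own measured, warmed-up session, so the order in which the two are measured no longer affects the result the way it can with two back-to-back `Stopwatch` loops.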