Search code examples
c#linq.net-6.0

Why is the Enumerable.Any(Func<TSource, bool> predicate) slow compared to a foreach with an if statement when searching a List<T>


Something has piqued my curiosity recently..

Why is the Enumerable.Any(Func<TSource, bool> predicate) method so much slower than manual foreach, when they do the same thing?

I've been messing with some benchmarks and thought of this. I'm checking of a List<int> contains and item that's approximately in the half of the list.

Here are my test results for a few diffent sizes of the list:

Items: 1 000, searched item: 543

Method Mean Ratio Allocated Alloc Ratio
Foreach 838.3 ns 1.00 - NA
Any 3,348.8 ns 4.05 40 B NA

Items: 10 000, searched item: 5 432

Method Mean Ratio Allocated Alloc Ratio
Foreach 7.988 us 1.00 - NA
Any 30.991 us 3.88 40 B NA

Items: 100 000, searched item: 54 321

Method Mean Ratio Allocated Alloc Ratio
Foreach 82.35 us 1.00 - NA
Any 328.86 us 4.00 40 B NA

There are two benchmarks:

  • Foreach: manual foreach with an if statement
  • Any: LINQ's Any method (that turns into Enumerable.Any)

Here's my code for the benchmarks (using BenchmarkDotNet, .NET 6.0 console app running in Release mode):

[MemoryDiagnoser(displayGenColumns: false)]
[HideColumns("Error", "StdDev", "RatioSD")]
public class Benchmarks
{
    private readonly List<int> _items;
    private readonly Func<int, bool> _filter;

    public Benchmarks()
    {
        _items = Enumerable.Range(1, 10_000).ToList();
        _filter = x => x == 5432;
    }

    [Benchmark(Baseline = true)]
    public bool Foreach()
    {
        if (_items is null)
        {
            throw new ArgumentNullException(nameof(_items));
        }

        if (_filter is null)
        {
            throw new ArgumentNullException(nameof(_filter));
        }

        foreach (var item in _items)
        {
            if (_filter(item))
            {
                return true;
            }
        }

        return false;
    }

    [Benchmark]
    public bool Any()
    {
        return _items.Any(_filter);
    }
}

The Any approach is 4 times slower and allocates a bit of memory despite my best attempts to optimize it.

I tried to make the Any approach faster by caching the predicate (Func<int, bool>) in a variable (_filter). However, it still allocates 40B and I have no idea why...

When decompiled, the Any approach turns into Enumerable.Any(Func<TSource, bool> predicate) method:

public static bool Any<TSource>(this IEnumerable<TSource> source, Func<TSource, bool> predicate)
{
    if (source == null)
    {
        ThrowHelper.ThrowArgumentNullException(ExceptionArgument.source);
    }

    if (predicate == null)
    {
        ThrowHelper.ThrowArgumentNullException(ExceptionArgument.predicate);
    }

    foreach (TSource element in source)
    {
        if (predicate(element))
        {
            return true;
        }
    }

    return false;
}

How is the Any approach different from the Foreach approach? Just curious...


Solution

  • As Jon Skeet suggested in the comments, I tried changing the _items collection from a List<int> to IEnumerable<int> to make the comparison fair. In short, that seems to be the key difference. My Foreach seems to be taking advantage of that fact that it know the _items collection is a List<T> while the Enumerable.Any method takes an IEnumerable<int>.

    Here are the benchmark results for that:

    Items: 1 000, searched item: 543

    Method Mean Ratio Allocated Alloc Ratio
    Foreach 2.126 us 1.00 40 B 1.00
    Any 2.131 us 1.00 40 B 1.00

    Items: 10 000, searched item: 5 432

    Method Mean Ratio Allocated Alloc Ratio
    Foreach 21.35 us 1.00 40 B 1.00
    Any 21.20 us 0.99 40 B 1.00

    Items: 100 000, searched item: 54 321

    Method Mean Ratio Allocated Alloc Ratio
    Foreach 220.7 us 1.00 40 B 1.00
    Any 219.1 us 0.99 40 B 1.00

    When working with IEnumerable<int>, the two approaches are performing the same. Thanks Jon Skeet!