Tags: c#, performance, linq

IEnumerable.Count() or ToList().Count


I have a List of objects of my own class, which looks like this:

public class IFFundTypeFilter_ib
{
    public string FundKey { get; set; }
    public string FundValue { get; set; }
    public bool IsDisabled { get; set; }
}

The property IsDisabled is set by running the query collection.Where(some condition) and counting the number of matching objects. The result is an IEnumerable<IFFundTypeFilter_ib>, which has no Count property. I wonder which would be faster.

This one:

collection.Where(somecondition).Count();

or this one:

collection.Where(somecondition).ToList().Count;

The collection could contain just a few objects, but it could also contain, for example, 700. I am going to make the counting call twice, each time with a different condition. In the first condition I check whether FundKey equals some key; in the second I do the same, but compare it with another key value.


Solution

  • You asked:

    I wonder, what would be faster.

    Whenever you ask that, you should actually time it and find out.

    I set out to test all of these variants of obtaining a count:

    var enumerable = Enumerable.Range(0, 1000000);
    var list = enumerable.ToList();
    
    var methods = new Func<int>[]
    {
        () => list.Count,
        () => enumerable.Count(),
        () => list.Count(),
        () => enumerable.ToList().Count(),
        () => list.ToList().Count(),
        () => enumerable.Select(x => x).Count(),
        () => list.Select(x => x).Count(),
        () => enumerable.Select(x => x).ToList().Count(),
        () => list.Select(x => x).ToList().Count(),
        () => enumerable.Where(x => x % 2 == 0).Count(),
        () => list.Where(x => x % 2 == 0).Count(),
        () => enumerable.Where(x => x % 2 == 0).ToList().Count(),
        () => list.Where(x => x % 2 == 0).ToList().Count(),
    };
    

    My testing code explicitly runs each method 1,000 times, measures each execution time with a Stopwatch, and ignores all results where garbage collection occurred. It then gets an average execution time per method.

    var measurements =
        methods
            .Select((m, i) => i)
            .ToDictionary(i => i, i => new List<double>());
    
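    // Stopwatch lives in System.Diagnostics.
    for (var run = 0; run < 1000; run++)
    {
        for (var i = 0; i < methods.Length; i++)
        {
            var sw = Stopwatch.StartNew();
            // Gen 0 collection count before and after the call under test.
            var gccc0 = GC.CollectionCount(0);
            var r = methods[i]();
            var gccc1 = GC.CollectionCount(0);
            sw.Stop();
            // Keep the timing only if no garbage collection ran during the call.
            if (gccc1 == gccc0)
            {
                measurements[i].Add(sw.Elapsed.TotalMilliseconds);
            }
        }
    }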
    
    var results =
        measurements
            .Select(x => new
            {
                index = x.Key,
                count = x.Value.Count(),
                average = x.Value.Average().ToString("0.000")
            });
    

    Here are the results, as the average time per call in milliseconds (ordered from slowest to fastest):

    +---------+-----------------------------------------------------------+
    | average |                          method                           |
    +---------+-----------------------------------------------------------+
    | 14.879  | () => enumerable.Select(x => x).ToList().Count(),         |
    | 14.188  | () => list.Select(x => x).ToList().Count(),               |
    | 10.849  | () => enumerable.Where(x => x % 2 == 0).ToList().Count(), |
    | 10.080  | () => enumerable.ToList().Count(),                        |
    | 9.562   | () => enumerable.Select(x => x).Count(),                  |
    | 8.799   | () => list.Where(x => x % 2 == 0).ToList().Count(),       |
    | 8.350   | () => enumerable.Where(x => x % 2 == 0).Count(),          |
    | 8.046   | () => list.Select(x => x).Count(),                        |
    | 5.910   | () => list.Where(x => x % 2 == 0).Count(),                |
    | 4.085   | () => enumerable.Count(),                                 |
    | 1.133   | () => list.ToList().Count(),                              |
    | 0.000   | () => list.Count,                                         |
    | 0.000   | () => list.Count(),                                       |
    +---------+-----------------------------------------------------------+
    

    Two significant things come out of these results.

    One, any method with a .ToList() inline is significantly slower than the equivalent without it.

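    Two, LINQ operators take advantage of the underlying type of the enumerable, where possible, to short-cut computations. The enumerable.Count() and list.Count() results show this: Count() on a List<T> reads the list's stored count instead of enumerating it, which is why it is as fast as the list.Count property.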
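
    The short-cut can be observed directly. The following is a minimal sketch, not part of the benchmark above: CountingCollection<T> is a hypothetical wrapper, introduced only for illustration, that records how often it is enumerated. It assumes that Count() checks for ICollection<T> before falling back to enumeration, which is how Enumerable.Count() behaves.

    ```csharp
    using System;
    using System.Collections;
    using System.Collections.Generic;
    using System.Linq;

    // Hypothetical wrapper (not from the answer): a read-only collection
    // that counts how many times an enumerator is requested.
    public sealed class CountingCollection<T> : ICollection<T>
    {
        private readonly List<T> _items;
        public int Enumerations { get; private set; }

        public CountingCollection(IEnumerable<T> items) { _items = new List<T>(items); }

        public int Count => _items.Count;
        public bool IsReadOnly => true;
        public bool Contains(T item) => _items.Contains(item);
        public void CopyTo(T[] array, int arrayIndex) => _items.CopyTo(array, arrayIndex);
        public void Add(T item) => throw new NotSupportedException();
        public void Clear() => throw new NotSupportedException();
        public bool Remove(T item) => throw new NotSupportedException();

        public IEnumerator<T> GetEnumerator()
        {
            Enumerations++;
            return _items.GetEnumerator();
        }

        IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
    }

    public static class CountShortcutDemo
    {
        public static void Main()
        {
            var source = new CountingCollection<int>(Enumerable.Range(0, 1000));

            // Count() sees ICollection<T> and reads the Count property.
            int direct = source.Count();
            Console.WriteLine($"{direct} items, {source.Enumerations} enumerations");
            // prints "1000 items, 0 enumerations"

            // Where() hides the collection, so Count() has to walk every element.
            int filtered = source.Where(x => x % 2 == 0).Count();
            Console.WriteLine($"{filtered} items, {source.Enumerations} enumerations");
            // prints "500 items, 1 enumerations"
        }
    }
    ```

    This is also why the filtered queries in the table cannot be free: once a Where is in the chain, the count is no longer stored anywhere and has to be computed by enumerating.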

    There is no difference between the list.Count and list.Count() calls. So the key comparison is between the enumerable.Where(x => x % 2 == 0).Count() and enumerable.Where(x => x % 2 == 0).ToList().Count() calls. Since the latter contains an extra operation, we would expect it to take longer, and it does: 10.849 ms versus 8.350 ms, almost 2.5 milliseconds longer.

    I don't know why you say that you're going to call the counting code twice, but if you do count the same query result twice, it is better to build the list once and read its Count. If not, just make the plain .Count() call after your query.
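
    To illustrate that last point, here is a rough sketch using the class from the question. The keys "A" and "B", the HasKey helper, and the PredicateCalls counter are assumptions made up for the example. It shows that a Where query is re-evaluated on every Count() call, while a list built with ToList() is filtered only once.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    public class IFFundTypeFilter_ib
    {
        public string FundKey { get; set; }
        public string FundValue { get; set; }
        public bool IsDisabled { get; set; }
    }

    public static class CountTwiceDemo
    {
        // Number of times the filter has run; for illustration only.
        public static int PredicateCalls;

        // Hypothetical helper: the question's "FundKey equals some key" check.
        public static bool HasKey(IFFundTypeFilter_ib item, string key)
        {
            PredicateCalls++;
            return item.FundKey == key;
        }

        public static void Main()
        {
            // 700 items with made-up keys "A" and "B", alternating.
            var collection = Enumerable.Range(0, 700)
                .Select(i => new IFFundTypeFilter_ib { FundKey = i % 2 == 0 ? "A" : "B" })
                .ToList();

            // Deferred query: each Count() call runs the filter over all items again.
            var query = collection.Where(x => HasKey(x, "A"));
            int first = query.Count();
            int second = query.Count();
            Console.WriteLine($"{first}, {second}: {PredicateCalls} predicate calls");
            // prints "350, 350: 1400 predicate calls"

            // Materialized once: the filter runs a single time, Count is a property read.
            PredicateCalls = 0;
            var matches = collection.Where(x => HasKey(x, "A")).ToList();
            Console.WriteLine($"{matches.Count}, {matches.Count}: {PredicateCalls} predicate calls");
            // prints "350, 350: 700 predicate calls"

            // Two different conditions, as in the question: each is counted once,
            // so the Count(predicate) overload is enough and no list is needed.
            int countA = collection.Count(x => x.FundKey == "A");
            int countB = collection.Count(x => x.FundKey == "B");
            Console.WriteLine($"{countA} with key A, {countB} with key B");
        }
    }
    ```

    If each condition is only counted once, as the question describes, the last form avoids both the extra list and the repeated filtering.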