Tags: linq, performance, delayed-execution

Is LINQ faster or just more convenient?


Which of these scenarios would be faster?

Scenario 1:

foreach (var file in directory.GetFiles())
{
    if (file.Extension.ToLower() != ".txt" &&
        file.Extension.ToLower() != ".bin")
        continue;

    // Do something cool.
}

Scenario 2:

var files = from file in directory.GetFiles()
            where file.Extension.ToLower() == ".txt" ||
                  file.Extension.ToLower() == ".bin"
            select file;

foreach (var file in files)
{
    // Do something cool.
}

I know that they are logically the same because of delayed execution, but which would be the faster? And why?


Solution

  • Speed usually isn't the real issue, especially in a scenario like this, where there won't be a meaningful performance difference (and in general, if the code isn't a bottleneck, it just doesn't matter). The real question is which version is more readable and more clearly expresses the intent of the code.

    I think the second block of code more clearly expresses the intent. It reads as "query a collection of files for those with a certain property" and then "for each of those files, do something." It declares what is happening rather than how it happens. Separating the what from the mechanism is what makes the second block clearer, and it is where LINQ really shines. Use LINQ to declare the what and let LINQ implement the mechanism; in pre-LINQ code, the what tended to be muddled with the mechanism.
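
    As a rough illustration of declaring the what, here is a minimal sketch of the same filter in method syntax. The names FileFilter, WantedExtensions and TextAndBinFiles are made up for this example, and it swaps the ToLower() comparisons for a case-insensitive set lookup, which is an assumption about what the original code intends rather than an exact equivalent.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class FileFilter
{
    // Hypothetical helper: the "what" (which extensions count) lives in one place.
    // OrdinalIgnoreCase replaces the ToLower() calls from the question.
    private static readonly HashSet<string> WantedExtensions =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { ".txt", ".bin" };

    // Returns a lazy query; nothing is filtered until the caller enumerates it.
    public static IEnumerable<FileInfo> TextAndBinFiles(IEnumerable<FileInfo> files)
    {
        return files.Where(file => WantedExtensions.Contains(file.Extension));
    }

    public static void Main()
    {
        var directory = new DirectoryInfo(Directory.GetCurrentDirectory());

        foreach (var file in TextAndBinFiles(directory.GetFiles()))
        {
            Console.WriteLine(file.Name); // "Do something cool" goes here.
        }
    }
}
```

    The loop body no longer needs to know how the files were selected, only that they were.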

    "Is LINQ faster or just more convenient?"

    So, to answer the question in your title: LINQ usually does not materially hurt performance, and it makes code clearer by letting the coder declare what they want done instead of focusing on how to do it. At the end of the day, we care about the what, not the how.

    "I know that they are logically the same because of delayed execution, but which would be the faster?"

    Probably the imperative version, because LINQ adds a tiny amount of overhead. But if you really must know which is faster, use a profiler, and be sure to test on real-world data.
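
    Short of a full profiler, a rough first look is possible with Stopwatch. The sketch below is only that: the class and method names are hypothetical, the file names are synthetic strings rather than real-world data, and a dedicated profiler or benchmarking tool will give far more trustworthy numbers.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

public static class LoopVersusLinq
{
    // Shared predicate so both versions do the same work per element.
    private static bool IsWanted(string name) =>
        name.EndsWith(".txt", StringComparison.OrdinalIgnoreCase) ||
        name.EndsWith(".bin", StringComparison.OrdinalIgnoreCase);

    // Scenario 1: filter inside the loop.
    public static int CountWithLoop(string[] names)
    {
        int count = 0;
        foreach (var name in names)
        {
            if (!IsWanted(name))
                continue;
            count++;
        }
        return count;
    }

    // Scenario 2: filter with a LINQ query, then loop over the result.
    public static int CountWithLinq(string[] names)
    {
        var wanted = from name in names
                     where IsWanted(name)
                     select name;
        int count = 0;
        foreach (var name in wanted)
            count++;
        return count;
    }

    // Runs the action once to warm up the JIT, then times the repetitions.
    public static TimeSpan Time(Func<int> action, int repetitions)
    {
        action();
        var stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < repetitions; i++)
            action();
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }

    public static void Main()
    {
        // Synthetic stand-in for a directory listing (an assumption for this sketch).
        string[] extensions = { ".txt", ".bin", ".jpg", ".TXT" };
        string[] names = Enumerable.Range(0, 100000)
            .Select(i => "file" + i + extensions[i % extensions.Length])
            .ToArray();

        Console.WriteLine("loop: " + Time(() => CountWithLoop(names), 100));
        Console.WriteLine("LINQ: " + Time(() => CountWithLinq(names), 100));
    }
}
```

    With real files, the cost of directory.GetFiles() itself is likely to dwarf any difference between the two loops, which is one more reason to measure on real data.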

    "And why?"

    Because LINQ adds a little overhead: the filter runs through an extra iterator and a delegate call per element. The trade-off is significantly clearer and more maintainable code, which is a huge win compared to the usually irrelevant performance loss.
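
    To make that overhead concrete, here is a simplified stand-in for what a Where-style filter does. It is not the framework's actual implementation, and WhereSketch and Filter are hypothetical names; the point is only that each element passes through an iterator and a delegate call, and that nothing runs until the query is enumerated.

```csharp
using System;
using System.Collections.Generic;

public static class WhereSketch
{
    // Simplified stand-in for a Where-style operator (not the real framework code).
    public static IEnumerable<T> Filter<T>(IEnumerable<T> source, Func<T, bool> predicate)
    {
        foreach (var item in source)   // compiled into an iterator state machine
        {
            if (predicate(item))       // one delegate invocation per element
                yield return item;
        }
    }

    public static void Main()
    {
        int calls = 0;
        var extensions = new[] { ".txt", ".jpg", ".bin" };

        // Building the query runs no filtering code yet.
        var query = Filter(extensions, ext =>
        {
            calls++;
            return ext == ".txt" || ext == ".bin";
        });
        Console.WriteLine(calls); // prints 0

        foreach (var ext in query)
            Console.WriteLine(ext); // prints .txt, then .bin

        Console.WriteLine(calls); // prints 3
    }
}
```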