LINQ deferred execution

I wrote a simple program. Here's what it looks like, with some details hidden:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;

namespace routeaccounts
{
    class Program
    {
        static void Main(string[] args)
        {
            //Draw lines from source file
            var lines = File.ReadAllLines("accounts.txt").Select(p => p.Split('\t'));
            //Convert lines into accounts
            var accounts = lines.Select(p => new Account(p[0], p[1], p[2], p[3]));
            //Submit accounts to router
            var results = accounts.Select(p => RouteAccount(p));
            //Write results list to target file
            WriteResults("results.txt", results);
        }

        private static void WriteResults(string filename, IEnumerable<Result> results)
        {
            ... disk write call ...
        }

        private static Result RouteAccount(Account account)
        {
            ... service call ...
        }
    }
}

My question is this: when selecting from a data context, execution is obviously deferred. In the first statement of the 'Main' function, I'm querying from File.ReadAllLines("accounts.txt"). Is this a bad choice? If I enumerate the final result, will this statement be executed repeatedly?

I can simply call .ToArray() or grab the results ahead of time if I know it's a problem, but I'm interested to know what's going on behind the scenes.


  • No, it's not going to read the file repeatedly, because that part of the execution isn't deferred. File.ReadAllLines returns an array immediately, and the call to Select then returns a sequence over it. The projection is deferred, but the reading of the file isn't. That array stays in memory until everything referring to it (directly or indirectly) is eligible for garbage collection, so the file never needs to be reread. The deferred projections themselves, including the RouteAccount call, do run again each time the results are enumerated.

    On the other hand, you may want to materialize the results using ToList() or something similar anyway, because that way you find out about any errors before you start to write the results. It's quite often a good idea to make sure you've got all the data you need before you start executing code with side effects (which I imagine WriteResults does). It's less efficient in terms of the amount of data held in memory at one time, though, so it's a balance you'll have to weigh up yourself.
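The behaviour described above can be observed with a small counting sketch. It is not the original program: `ReadAllLinesStub` is a hypothetical stand-in for `File.ReadAllLines` (so no file is needed), and the class name, counters, and sample data are assumptions made for illustration. It counts how often the source is read and how often the projection lambda runs, first across repeated enumerations of the deferred query and then after materializing with `ToList()`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    private static int readCount;        // times the "file" was read
    private static int projectionCount;  // times the Select lambda ran

    // Hypothetical stand-in for File.ReadAllLines: eager, returns a full array.
    private static string[] ReadAllLinesStub()
    {
        readCount++;
        return new[] { "alice\t1", "bob\t2", "carol\t3" };
    }

    // Returns { reads, projections after building the query,
    //           projections after two enumerations,
    //           projections after ToList() plus two more enumerations }.
    public static int[] Run()
    {
        readCount = 0;
        projectionCount = 0;

        // The read happens right here, once. Only the Select is deferred.
        IEnumerable<string[]> lines = ReadAllLinesStub().Select(p =>
        {
            projectionCount++;
            return p.Split('\t');
        });
        int afterBuild = projectionCount;

        // Each enumeration of the deferred query reruns the projection,
        // but reuses the array that is already in memory.
        foreach (var fields in lines) { }
        foreach (var fields in lines) { }
        int afterTwoPasses = projectionCount;

        // ToList() runs the projection one more time and caches the results,
        // so enumerating the list does not rerun it.
        List<string[]> materialized = lines.ToList();
        foreach (var fields in materialized) { }
        foreach (var fields in materialized) { }
        int afterToList = projectionCount;

        return new[] { readCount, afterBuild, afterTwoPasses, afterToList };
    }

    public static void Main()
    {
        int[] r = Run();
        Console.WriteLine($"reads={r[0]}, afterBuild={r[1]}, " +
                          $"afterTwoPasses={r[2]}, afterToList={r[3]}");
        // prints "reads=1, afterBuild=0, afterTwoPasses=6, afterToList=9"
    }
}
```

The source is read once no matter how often the query is enumerated, while the projection runs once per element per enumeration. In the program from the question, the projection chain includes the service call in `RouteAccount`, so materializing before `WriteResults` would also keep a second enumeration from repeating those calls.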