I have a question about how garbage collection might be handled in a LINQ query. Suppose I am given a list of requests to process. Each request generates a very large set of data, but then a filter is applied to keep only the critical data from each requested load.
//Input data
List<request> requests;
IEnumerable<filteredData> results = requests.Select(request => Process(request)).Select(data => Filter(data));
So I know that the query is deferred: each data item is only processed when the corresponding filtered item is requested, so that's good. But does that middle, memory-intensive part persist until the enumerable is completed?
What I am hoping happens is that each data element can be garbage collected as soon as it has passed the filter stage, so that I have enough memory to process the whole list. Is this the case, or does the middle enumerable keep everything around until the entire query ends? If so, is there a LINQ way to deal with this?
Note: the Process() function generates the memory-intensive data; that's what I'm worried about.
As long as the return value of Process(...) and Filter(...) do not hold any references to the "large data items" used internally, then the memory used in that process should become unrooted and a candidate for GC after each element is processed.
This doesn't mean it will be collected right away, only that it becomes a candidate for collection. If memory pressure gets high, the GC will most likely collect it.
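To make that concrete, here is a minimal sketch of the same pipeline. The names StreamingDemo and Run, the int request type, and the 1 MB buffer are all made up for illustration; they stand in for your request, Process and Filter. The point is that Filter copies out a small value and keeps no reference to the large buffer, so once an item has been filtered nothing in the query still roots that buffer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StreamingDemo
{
    // Hypothetical stand-in for the memory-intensive step: about 1 MB per request.
    public static byte[] Process(int request)
    {
        var data = new byte[1024 * 1024];
        data[0] = (byte)request;
        return data;
    }

    // Hypothetical filter: copies out one small value and keeps
    // no reference to the large buffer it was given.
    public static int Filter(byte[] data)
    {
        return data[0];
    }

    public static List<int> Run(IEnumerable<int> requests)
    {
        // Deferred: nothing is processed until the foreach below pulls items.
        IEnumerable<int> results = requests
            .Select(request => Process(request))
            .Select(data => Filter(data));

        var kept = new List<int>();
        foreach (int item in results)
        {
            // Only the small filtered value is stored. The buffer behind the
            // previous item is no longer referenced by this code and is
            // eligible for collection (when it is collected is up to the GC).
            kept.Add(item);
        }
        return kept;
    }
}
```

Calling StreamingDemo.Run(requests) pulls one request through Process and Filter at a time. By contrast, materializing the middle stage (for example calling ToList() right after the first Select) would keep every large buffer alive at once, which is the situation you want to avoid.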