I have the following method for looping through a table, changing some values in every row and saving the changes back to the database. To make things go faster I am fetching the data in batches of 10,000 rows. This is a large table with over 25 million records in it.

The problem is that my application doesn't seem to be releasing any memory. I have tried setting the records variable to Nothing and explicitly calling the garbage collector, but the memory stays there.

Running the built-in VS2010 profiler, I can see that the culprit is the System.Linq.Enumerable.ToList() method, which accounts for over 98% of my memory. How do I release that memory after the call to SaveChanges()?
Dim db As New databaseEntities
Dim size = 25000000
Dim stepSize = 10000

For i = 0 To size Step stepSize
    ' Fetch the next batch of 10,000 rows
    Dim sql = (From A In db.table).OrderBy(Function(A) A.Column).Skip(i).Take(stepSize)
    Dim records As List(Of table) = sql.ToList()

    For Each record In records
        'do some work
    Next

    db.SaveChanges()

    ' Neither of these seems to release the memory
    records = Nothing
    GC.Collect()
Next
The repository (your databaseEntities context) holds a reference to each entity it is tracking, so the garbage collector can't reclaim an entity while the repository is alive and still tracking it. That means you either need to dispose the repository, or detach each entity after you've finished processing it.
Option 1) If "do some work" doesn't affect the order the records would be returned in, you could move the creation of databaseEntities inside the For loop and declare it with a Using block. That should cause each batch of entities to be released each time round the For loop.
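For example (a minimal sketch, assuming databaseEntities, table and Column are the same generated context, entity set and property as in your code):

Dim size = 25000000
Dim stepSize = 10000

For i = 0 To size Step stepSize
    ' A fresh context per batch only ever tracks that batch's entities
    Using db As New databaseEntities
        Dim records = (From A In db.table).OrderBy(Function(A) A.Column) _
                      .Skip(i).Take(stepSize).ToList()

        For Each record In records
            'do some work
        Next

        db.SaveChanges()
    End Using ' disposing the context here lets the GC reclaim the batch
Next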
Option 2) If your operation is essentially parallel, and what you do to one "table" entity doesn't have any dependencies on any other, then you could call databaseEntities.Detach(record) after db.SaveChanges, which will enable the garbage collector to reclaim the entity's space.
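For example (a sketch that assumes db is an ObjectContext-derived databaseEntities; Detach is an ObjectContext method, so with a DbContext you would set db.Entry(record).State = EntityState.Detached instead):

For Each record In records
    'do some work
Next

db.SaveChanges()

' Stop the context tracking each processed entity so the GC can
' reclaim it once the records list goes out of scope
For Each record In records
    db.Detach(record)
Next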
Looking at your code, I suspect either of these could be used.