I wrote a piece of test code that uses AsParallel
to read big files concurrently, and it appears to cause a memory leak: the GC doesn't reclaim the unused objects as I expected. Please see the code snippet below.
static void Main(string[] args)
{
    int[] array = new[] { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };

    //foreach (var element in array) // No leaks
    //{
    //    ReadBigFile();
    //}

    array.AsParallel().ForAll(l => ReadBigFile()); // Memory leak
    Console.ReadLine();
}

private static void ReadBigFile()
{
    List<byte[]> lst = new List<byte[]>();
    var file = "<BigFilePath>"; // 600 MB
    lst.Add(File.ReadAllBytes(file));
}
I tried this both sequentially and in parallel. The sequential foreach
runs fine, with no memory leak. But when I use AsParallel
to read the file concurrently, memory usage climbs to about 6 GB and never comes back down.
Can you help identify the root cause? And what is the right way to complete the same task concurrently? Thank you.
PS: The issue happens on both .NET Framework (4.6.1) and .NET 6.0.
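For the "how to do this concurrently" part, one lower-memory approach (a sketch of my own, not from the original post; the `ParallelReadDemo` class, temp-file paths, and chunk size are all hypothetical) is to cap the degree of parallelism and stream each file in small chunks instead of materializing it with `File.ReadAllBytes`, so no 600 MB array is ever allocated on the Large Object Heap:

```csharp
using System;
using System.IO;
using System.Linq;

class ParallelReadDemo
{
    // Streams one file in small chunks so its whole contents never sit in a
    // single large array on the Large Object Heap; returns the bytes read.
    static long ProcessFile(string path)
    {
        using var stream = File.OpenRead(path);
        var buffer = new byte[81920]; // below the ~85,000-byte LOH threshold
        long total = 0;
        int read;
        while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
            total += read; // real per-chunk work would go here
        return total;
    }

    public static long Run()
    {
        // A few small temp files standing in for the 600 MB ones.
        var files = Enumerable.Range(0, 4)
            .Select(i =>
            {
                var path = Path.Combine(Path.GetTempPath(), $"plinq-demo-{i}.bin");
                File.WriteAllBytes(path, new byte[1024 * (i + 1)]);
                return path;
            })
            .ToArray();

        try
        {
            // Cap concurrency so at most two reads are in flight at once.
            return files.AsParallel()
                        .WithDegreeOfParallelism(2)
                        .Sum(ProcessFile);
        }
        finally
        {
            foreach (var f in files) File.Delete(f);
        }
    }

    static void Main() => Console.WriteLine(Run()); // 1024+2048+3072+4096 = 10240
}
```

With the concurrency capped at 2, at most two chunk buffers are live at any moment, regardless of how many files are queued.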
@DiplomacyNotWar I like your previous comment:

> If you do it sequentially then you'll have no references to the object from the previous loop, which makes it available for moving to the next generation and ultimately garbage collection.
Then I modified the code as
int[] array = new[] { 1, 2 };
for (int i = 0; i < 5; i++)
{
    array.AsParallel().ForAll(l => ReadBigFile());
}
Now I can see the memory allocation is only 1.1 GB, which should be roughly the memory needed for the last round of the loop. So I'm convinced it's just a matter of GC timing and not a real memory leak. Thank you @DiplomacyNotWar very much!
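For anyone who wants to verify the GC-timing conclusion without a profiler, here is a minimal sketch of my own (the `GcTimingDemo` class and method names are made up, and it uses 10 MB buffers in place of the 600 MB file). `GC.GetTotalMemory(true)` blocks until a full collection has run, after which the buffers allocated by the parallel loop are reclaimed:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class GcTimingDemo
{
    // Mimics ReadBigFile: allocates one large buffer and lets the reference
    // go out of scope when the method returns.
    static void AllocateBigBuffer()
    {
        var lst = new List<byte[]> { new byte[10 * 1024 * 1024] };
        // lst is unreachable as soon as this method returns.
    }

    // Returns the managed heap size after forcing a full collection.
    public static long Run()
    {
        // Ten parallel calls allocate ~100 MB in total.
        Enumerable.Range(0, 10).AsParallel().ForAll(_ => AllocateBigBuffer());

        // forceFullCollection: true blocks until a full GC has run, so the
        // now-unreachable buffers (including those on the LOH) are reclaimed.
        return GC.GetTotalMemory(forceFullCollection: true);
    }

    static void Main()
    {
        long after = Run();
        // Far less than the ~100 MB just allocated, i.e. nothing leaked.
        Console.WriteLine(after < 100 * 1024 * 1024);
    }
}
```

The high reading in Task Manager is just the process holding on to pages the GC hasn't compacted yet; forcing a full collection shows the managed heap itself has shrunk back down.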