c# heap-memory large-files

How to avoid LOH fragmentation and optimize the code


I have a simple thread that cleans cache files from a directory. Most of the files are large, more than 80 KB each, and there are about 2000 of them in the directory. The code is shown below.

while (true)
{
  if (Directory.Exists(CACHE_PATH))
  {
    List<FileInfo> filesList = new DirectoryInfo(CACHE_PATH).GetFiles("*", SearchOption.AllDirectories).ToList();
    long directorySize = filesList.Sum(e => e.Length);

    if (directorySize > CACHE_MAX_SIZE)
    {
      filesList.Sort(new FileInfoAccessTimeComparer());
      while (directorySize > CACHE_MAX_SIZE * 0.75)
      {
        directorySize -= filesList[0].Length;
        filesList[0].Delete();
        filesList.RemoveAt(0);
      }
    }
    filesList.Clear();
    filesList = null;
  }
  Thread.Sleep(CACHE_CLEANUP_INTERVAL);
}

I want to know whether this approach causes any LOH fragmentation, and whether I should use something other than List to hold the files (such as ArrayPool).

Additionally, is it good practice to use

List<FileInfo> filesList = new DirectoryInfo(CACHE_PATH).GetFiles("*", SearchOption.AllDirectories).ToList();

Instead of

FileInfo[] fileInfos = new DirectoryInfo(CACHE_PATH).GetFiles("*", SearchOption.AllDirectories);
List<FileInfo> filesList = fileInfos.ToList();

Solution

  • There is no Large Object Heap concern, since your underlying array is small (only a few thousand entries) and your FileInfo objects are small. FileInfo is just metadata - it isn't the contents of the files.
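
    As a rough way to check this yourself, the sketch below compares a reference array of the size in question with a buffer that is genuinely large. It assumes a 64-bit runtime with default GC settings, where objects of 85,000 bytes or more go to the LOH and are reported by GC.GetGeneration as the highest generation. LohCheck and LooksLikeLargeObject are hypothetical names, and the check is only a heuristic for freshly allocated objects, since any long-lived small object eventually reaches the same generation.

```csharp
using System;
using System.IO;

public static class LohCheck
{
    // Heuristic, meaningful only right after allocation: LOH objects are
    // reported as the highest generation immediately, whereas small
    // objects start in generation 0.
    public static bool LooksLikeLargeObject(object freshlyAllocated)
    {
        return GC.GetGeneration(freshlyAllocated) == GC.MaxGeneration;
    }

    public static void Main()
    {
        // 2000 references at 8 bytes each is roughly 16 KB, far below
        // the 85,000-byte threshold.
        var fileInfos = new FileInfo[2000];

        // A 100,000-byte buffer is above the threshold.
        var buffer = new byte[100_000];

        Console.WriteLine("FileInfo[2000] on LOH: " + LooksLikeLargeObject(fileInfos));
        Console.WriteLine("byte[100000] on LOH:   " + LooksLikeLargeObject(buffer));
    }
}
```

    The file contents would only matter here if the code read them into memory, which the cleanup loop never does.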

    There are no glaring issues with your code. You could avoid RemoveAt(0), though: each call shifts every remaining element of the list down by one position behind the scenes, which is unnecessary work. The code below avoids that (and also removes the need for the ToList call):

    while (true)
    {
        if (Directory.Exists(CACHE_PATH))
        {
            var filesList = new DirectoryInfo(CACHE_PATH).GetFiles("*", SearchOption.AllDirectories);
            long directorySize = filesList.Sum(e => e.Length);
    
            if (directorySize > CACHE_MAX_SIZE)
            {
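                // TakeWhile is evaluated lazily, so the size check reruns before each file.
                filesList.OrderBy(z => z, new FileInfoAccessTimeComparer())
                    .TakeWhile(z => directorySize > CACHE_MAX_SIZE * 0.75)
                    .ForEach(z =>
                    {
                        // Read Length before deleting; the cached metadata may not be valid afterwards.
                        directorySize -= z.Length;
                        z.Delete();
                    });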
            }
            filesList = null;
        }
        Thread.Sleep(CACHE_CLEANUP_INTERVAL);
    }
    

    Note that to use ForEach as I have, you will need to install MoreLINQ. If that is an issue, use a foreach loop instead.
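
    For reference, a minimal sketch of the plain foreach variant, with no MoreLINQ dependency, might look like the following. CacheCleaner and Trim are hypothetical names, and sorting by LastAccessTimeUtc is only an assumption about what FileInfoAccessTimeComparer does, since its code is not shown in the question.

```csharp
using System;
using System.IO;
using System.Linq;

public static class CacheCleaner
{
    // Deletes the least recently accessed files until the directory is no
    // larger than 75% of maxSize. Returns the remaining size in bytes.
    // Assumption: ordering by LastAccessTimeUtc stands in for the
    // question's FileInfoAccessTimeComparer.
    public static long Trim(string cachePath, long maxSize)
    {
        if (!Directory.Exists(cachePath))
            return 0;

        FileInfo[] files = new DirectoryInfo(cachePath)
            .GetFiles("*", SearchOption.AllDirectories);
        long directorySize = files.Sum(f => f.Length);

        if (directorySize <= maxSize)
            return directorySize;

        foreach (FileInfo file in files.OrderBy(f => f.LastAccessTimeUtc))
        {
            if (directorySize <= maxSize * 0.75)
                break;

            // Capture the length first, then delete.
            long length = file.Length;
            file.Delete();
            directorySize -= length;
        }

        return directorySize;
    }
}
```

    The body of the original while (true) loop would then reduce to a call such as CacheCleaner.Trim(CACHE_PATH, CACHE_MAX_SIZE) followed by the Thread.Sleep.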