c# directoryinfo filesysteminfo

How to efficiently retrieve a list of file names and lengths


I have a C# application that, at one point, scans a folder that could contain tens of thousands of files. It then filters that list by name and length and selects a relatively small number of files for processing.

Simplified code:

DirectoryInfo directoryInfo = new DirectoryInfo(path);
FileSystemInfo[] fileSystemInfos = directoryInfo.GetFileSystemInfos();
List<MyInfo> myInfoList = fileSystemInfos
    .Where(f => (f.Attributes & FileAttributes.Directory) != FileAttributes.Directory)
    .Select(f => new MyInfo {
        FilePath = f.FullName,
        FileSize = new FileInfo(f.FullName).Length,
        })
    .ToList();

The logic later selects a handful of files and verifies a non-zero length.

The problem is that the individual calls to new FileInfo(f.FullName).Length are killing performance. Under the covers, I see that FileInfo internally stores a WIN32_FILE_ATTRIBUTE_DATA struct that contains the length (fileSizeLow and fileSizeHigh), but FileSystemInfo does not expose it as a property.

Question: Is there a simple alternative to the above that can retrieve file names and lengths efficiently without the extra FileInfo.Length call?

My alternative is to make the MyInfo.FileSize property a lazy load property, but I wanted to check for a more direct approach first.
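
For reference, the lazy-load fallback could look roughly like the sketch below. It assumes MyInfo only needs FilePath and FileSize; the constructor and the use of Lazy<long> are illustrative choices, not the actual class from the application. The length is read from disk only the first time FileSize is accessed, so files that are filtered out by name never pay for the lookup.

```csharp
using System;
using System.IO;

// Hypothetical shape of MyInfo with a lazily loaded size.
public class MyInfo
{
    // Defers the file system call until FileSize is first read.
    private readonly Lazy<long> _fileSize;

    public MyInfo(string filePath)
    {
        FilePath = filePath;
        _fileSize = new Lazy<long>(() => new FileInfo(filePath).Length);
    }

    public string FilePath { get; }

    // First access queries the file system; later accesses reuse the cached value.
    public long FileSize => _fileSize.Value;
}
```

The trade-off is that the size is a snapshot taken at first access rather than at scan time, which may or may not matter for the later non-zero length check.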


Solution

  • (Answer from comments)

    Instead of calling GetFileSystemInfos can't you just call GetFiles? GetFiles returns FileInfo objects which have a Length property. Doing this also means that you don't have to manually weed out the directory entries as only files will be returned.
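
A minimal sketch of that suggestion follows. The MyInfo class is reconstructed from the question (the property types are assumed), and FileScanner.Scan is a hypothetical helper name used only for illustration. Each FileInfo returned by GetFiles is used directly, so no second FileInfo is constructed per file.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Assumed shape of MyInfo, based on the properties used in the question.
public class MyInfo
{
    public string FilePath { get; set; } = "";
    public long FileSize { get; set; }
}

// Hypothetical helper wrapping the scan.
public static class FileScanner
{
    public static List<MyInfo> Scan(string path)
    {
        var directoryInfo = new DirectoryInfo(path);

        // GetFiles returns only files (no directories) as FileInfo objects,
        // so Length is read from the object the enumeration already produced.
        return directoryInfo.GetFiles()
            .Select(f => new MyInfo
            {
                FilePath = f.FullName,
                FileSize = f.Length,
            })
            .ToList();
    }
}
```

If most entries are discarded by the name filter, DirectoryInfo.EnumerateFiles may be worth trying in place of GetFiles, since it yields FileInfo objects as the directory is read instead of building the whole array first.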