I wrote a .NET application that's designed to take a large number (>10,000) of very small files from one directory and organize them into a directory tree that goes root/yyyy/MM/dd/file.name. The application works, and it's fast for smallish numbers (<1,000) of files, but the more files I have to move, the longer it takes. I'm a newbie at .NET and C#, but could something like running the move in parallel make it go faster? Or compressing the files in batches before the move? Ultimately, I'm trying to avoid problems with the program stalling or failing when the number of files to move gets too large.
This is the code I'm using:
using System;
using System.IO;
using System.Configuration;

namespace consolefilemover
{
    internal class Program
    {
        static void Main(string[] args)
        {
            string rootDir = ConfigurationManager.AppSettings["rootDir"];
            string[] files = Directory.GetFiles(rootDir);
            string log = "auditlog.txt";

            foreach (string filePath in files)
            {
                FileInfo fileInfo = new FileInfo(filePath);
                DateTime lastModifiedDate = fileInfo.LastWriteTime;

                // Create the destination directory path based on the last modified date
                string destinationDir = Path.Combine(rootDir, lastModifiedDate.ToString("yyyy"), lastModifiedDate.ToString("MM"), lastModifiedDate.ToString("dd"));

                // Create the destination directory if it doesn't exist
                Directory.CreateDirectory(destinationDir);

                // Move the file to the destination directory
                string destinationFilePath = Path.Combine(destinationDir, Path.GetFileName(filePath));
                if (!File.Exists(destinationFilePath))
                {
                    File.Move(filePath, destinationFilePath);
                }
                else
                {
                    destinationFilePath = Path.Combine(destinationDir, Path.GetFileNameWithoutExtension(filePath) + DateTime.Now.ToString("yyyyMMddHHmmss") + Path.GetExtension(filePath));
                    File.Move(filePath, destinationFilePath);
                }

                // Make a location for monthly audit logs
                string logfile = Path.Combine(rootDir, lastModifiedDate.ToString("yyyy"), lastModifiedDate.ToString("MM"), log);

                // Define the data for the log file
                string logInfo = Environment.NewLine + lastModifiedDate.ToString("yyyyMMdd") + " | Source: " + filePath + " | dest: " + destinationFilePath;

                // Create the log file if needed and append the entry
                File.AppendAllText(logfile, logInfo);
            }
        }
    }
}
Start by replacing Directory.GetFiles with Directory.EnumerateFiles. This lets you process files as they are discovered instead of allocating one large array up front.
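As a rough sketch of that change (the FileStreaming class and TopLevelFiles method are names I made up for illustration), the enumeration could be wrapped like this. I'm assuming only the files directly inside rootDir should be picked up, as in your current code:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

internal static class FileStreaming
{
    // Hypothetical helper: lazily yields the files directly inside rootDir.
    // Directory.EnumerateFiles hands back entries as the directory is read,
    // whereas Directory.GetFiles builds the whole string[] before returning.
    public static IEnumerable<string> TopLevelFiles(string rootDir)
    {
        return Directory.EnumerateFiles(rootDir, "*", SearchOption.TopDirectoryOnly);
    }
}
```

Your loop would then read foreach (string filePath in FileStreaming.TopLevelFiles(rootDir)). Since the loop moves files out of the directory while it is still being enumerated, it's worth testing this on your actual file system before relying on it.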
Then, create a HashSet of the directories you have already created and check it before calling Directory.CreateDirectory. This saves a lot of I/O when the same destination directory comes up repeatedly, as it will when many files share a date.
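A minimal sketch of the HashSet idea, assuming a single-threaded loop like yours. DirectoryCache and EnsureCreated are hypothetical names, and the case-insensitive comparer is my assumption for Windows-style paths:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

internal sealed class DirectoryCache
{
    // Destination directories already created during this run.
    // Assumption: paths compare case-insensitively, as on Windows.
    private readonly HashSet<string> _created =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase);

    // Creates the directory only the first time a path is seen.
    // Returns true if Directory.CreateDirectory was actually called.
    public bool EnsureCreated(string destinationDir)
    {
        // HashSet.Add returns false when the value is already present,
        // so the file system is skipped for repeated paths.
        if (!_created.Add(destinationDir))
        {
            return false;
        }

        Directory.CreateDirectory(destinationDir);
        return true;
    }
}
```

You would create one DirectoryCache before the loop and call cache.EnsureCreated(destinationDir) where Directory.CreateDirectory(destinationDir) is now. With a yyyy/MM/dd layout, thousands of files usually map to a handful of directories, so most calls never touch the disk.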
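Also, replace the string concatenation
Path.GetFileNameWithoutExtension(filePath) + DateTime.Now.ToString("yyyyMMddHHmmss") + Path.GetExtension(filePath)
with interpolation
$"{Path.GetFileNameWithoutExtension(filePath)}{DateTime.Now.ToString("yyyyMMddHHmmss")}{Path.GetExtension(filePath)}"
Don't expect much from this one, though: the compiler already turns a chain of + operators into a single String.Concat call, so any saving in intermediate string allocations is small next to the file I/O. The main gain is readability.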
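If you go the interpolation route, a format specifier inside the braces keeps the expression shorter. This is only a sketch: NameBuilder and TimestampedName are hypothetical names, and passing the timestamp in as a parameter is my choice so the result is easy to check:

```csharp
using System;
using System.IO;

internal static class NameBuilder
{
    // Hypothetical helper: builds "<name><yyyyMMddHHmmss><ext>" for a file
    // whose name is already taken in the destination directory.
    public static string TimestampedName(string filePath, DateTime now)
    {
        // {now:yyyyMMddHHmmss} applies the format string inline; for this
        // pattern it gives the same text as now.ToString("yyyyMMddHHmmss").
        return $"{Path.GetFileNameWithoutExtension(filePath)}{now:yyyyMMddHHmmss}{Path.GetExtension(filePath)}";
    }
}
```

For example, NameBuilder.TimestampedName("report.txt", new DateTime(2024, 1, 2, 3, 4, 5)) gives report20240102030405.txt under a Gregorian-calendar culture. Note that two collisions within the same second would still produce the same name, just as in your original code.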