I am writing a console program that takes four parameters.
It searches the given directory (param 1) for all XML files with a specific extension (param 2) and a given modified date (param 3) that contain the given text (param 4).
The target directory has around 55000 xml files right now.
How can I improve the performance of this program?
Any comments on what could go wrong?
Update: I changed the code per Ashkan's response. Instead of checking the date in the filename, I now compare against the file's actual last-write date. I also added a try/catch block.
The following is the program I wrote in ASP.NET Core 2.2:
try
{
    var dirPath = args[0];
    var fileExtension = args[1];
    var searchDate = args[2];
    var searchText = args[3];
    DirectoryInfo dir = new DirectoryInfo(dirPath);
    IEnumerable<FileInfo> filelist = dir.GetFiles(fileExtension, SearchOption.AllDirectories)
        .Where(file => file.LastWriteTime.ToString("yyyy-MM-dd") == searchDate);
    var foundFilesCtr = 0;
    Console.WriteLine($"Searching for {searchText} in {dir}");
    Console.WriteLine("------------------------------------");
    Console.WriteLine("Search results...");
    Console.WriteLine($"Found {filelist.Count()} files with extension {fileExtension} and dated {searchDate}");
    foreach (var item in filelist)
        if (File.ReadAllLines(item.FullName).Contains(searchText))
        {
            Console.WriteLine($"File with selected content: {item.FullName}");
            foundFilesCtr++;
        }
    Console.WriteLine($"Found {foundFilesCtr} files with text {searchText}");
    Console.WriteLine("------------------------------------");
}
catch(Exception ex)
{
    Console.WriteLine(ex.InnerException);
}
1. Instead of getting all files and then filtering them, get only the files with the given extension:
string[] filelist = Directory.GetFiles(dirPath, fileExtension, SearchOption.AllDirectories)
    .Where(file => Path.GetFileNameWithoutExtension(file).Contains(searchDate))
    .ToArray();
2. Although the files are XML, you are treating them as strings (xdoc.Document.ToString().Contains(searchText)), so just load them as strings and skip the cost of loading an XML document:
foreach (var file in filelist)
    if (File.ReadAllText(file).Contains(searchText))
        foundFilesCtr++;
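With around 55000 files, it may also help to stream both the directory listing and the file contents instead of materializing them. The following is only a sketch under a few assumptions: `FileSearch` and `FindMatches` are hypothetical names, the date is compared against the file's last-write date, and unreadable files are simply skipped. It uses `Directory.EnumerateFiles` and `File.ReadLines`, both of which are lazy, so the scan of each file stops at the first matching line.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class FileSearch
{
    // Hypothetical helper: lazily yields paths of files matching `pattern` under `root`
    // whose last-write date equals `date` and that contain `text` on some line.
    public static IEnumerable<string> FindMatches(string root, string pattern, DateTime date, string text)
    {
        // EnumerateFiles returns paths as they are discovered instead of building a full array.
        foreach (var path in Directory.EnumerateFiles(root, pattern, SearchOption.AllDirectories))
        {
            // Compare dates as DateTime values rather than formatted strings.
            if (File.GetLastWriteTime(path).Date != date.Date)
                continue;

            bool found;
            try
            {
                // ReadLines streams the file; Any stops reading at the first hit.
                found = File.ReadLines(path).Any(line => line.Contains(text));
            }
            catch (IOException)
            {
                continue; // assumption: skip files that are locked or vanish mid-scan
            }
            catch (UnauthorizedAccessException)
            {
                continue; // assumption: skip files we are not allowed to read
            }

            if (found)
                yield return path;
        }
    }
}
```

It could be called as `FileSearch.FindMatches(dirPath, fileExtension, DateTime.Parse(searchDate), searchText)` and the results printed in a `foreach`. One trade-off to keep in mind: because the check is done line by line, search text that spans a line break will not be found, whereas `File.ReadAllText(file).Contains(searchText)` would find it.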