c# asp.net asp.net-core io console-application

Need recommendations to improve a file content search program I am writing using ASP.NET Core


I am writing a console program that takes the following parameters:

  1. Directory path
  2. File extension
  3. Search date in "yyyy-MM-dd" format
  4. Search text

It searches the given directory (param 1) for all XML files with the given extension (param 2) that were modified on the search date (param 3) and contain the search text (param 4).

The target directory currently holds around 55,000 XML files.

How can I improve the performance of this program?

Any comments on what could go wrong?

Update: the code now reflects the changes from Ashkan's response. Instead of checking the date in the filename, I compare against the file's actual last-write date. I also added a try/catch block.

The following is the program I wrote in ASP.NET Core 2.2:

try
{
    var dirPath = args[0];
    var fileExtension = args[1];
    var searchDate = args[2];
    var searchText = args[3];

    DirectoryInfo dir = new DirectoryInfo(dirPath);

    IEnumerable<FileInfo> filelist = dir.GetFiles(fileExtension, SearchOption.AllDirectories)
                                        .Where(file => file.LastWriteTime.ToString("yyyy-MM-dd") == searchDate);

    var foundFilesCtr = 0;

    Console.WriteLine($"Searching for {searchText} in {dir}");
    Console.WriteLine("------------------------------------");
    Console.WriteLine("Search results...");
    Console.WriteLine($"Found {filelist.Count()} files with extension {fileExtension} and dated {searchDate}");

    foreach (var item in filelist)
        if (File.ReadAllLines(item.FullName).Contains(searchText))
        {
            Console.WriteLine($"File with selected content: {item.FullName}");
            foundFilesCtr++;
        }

    Console.WriteLine($"Found {foundFilesCtr} files with text {searchText}");
    Console.WriteLine("------------------------------------");
}
catch(Exception ex)
{
    Console.WriteLine(ex.InnerException);
}

Solution

  • 1. Instead of getting all files and then filtering them, get only the files with the given extension:

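    // Pass the directory as the first argument; the extension is used as the search pattern.
    string[] filelist = Directory.GetFiles(dirPath, fileExtension, SearchOption.AllDirectories)
               .Where(file => Path.GetFileNameWithoutExtension(file).Contains(searchDate))
               .ToArray();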
    

    2. Although the files are XML, you are treating their content as a plain string (xdoc.Document.ToString().Contains(searchText)), so just read each file as text and skip the cost of loading an XML document:

    foreach (var file in filelist)
        if (File.ReadAllText(file).Contains(searchText))
            foundFilesCtr++;
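
Putting both points together, and taking the question's update into account (filtering on the file's actual last-write date rather than on the filename), a minimal sketch might look like the one below. The class name FileContentSearch and the helpers FindCandidates and ContainsText are hypothetical names chosen for illustration, and the sketch assumes the second argument is a search pattern such as *.xml. Directory.EnumerateFiles yields paths lazily instead of building an array of every entry up front, and File.ReadLines combined with Any stops reading a file at the first matching line.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;

// Hypothetical helper class; the names are illustrative, not from the original post.
public static class FileContentSearch
{
    // Lazily yields files under dirPath that match the pattern
    // and were last written on the given calendar day (local time).
    public static IEnumerable<string> FindCandidates(string dirPath, string pattern, DateTime day)
    {
        return Directory.EnumerateFiles(dirPath, pattern, SearchOption.AllDirectories)
                        .Where(path => File.GetLastWriteTime(path).Date == day.Date);
    }

    // Returns true as soon as any line contains the text (case-sensitive).
    // Files that cannot be read are treated as "no match" instead of aborting the run.
    public static bool ContainsText(string path, string searchText)
    {
        try
        {
            return File.ReadLines(path).Any(line => line.Contains(searchText));
        }
        catch (IOException)
        {
            return false;
        }
        catch (UnauthorizedAccessException)
        {
            return false;
        }
    }

    public static int Main(string[] args)
    {
        // Parse the date once instead of formatting every file's timestamp as a string.
        DateTime day = default(DateTime);
        bool ok = args.Length == 4 &&
                  DateTime.TryParseExact(args[2], "yyyy-MM-dd", CultureInfo.InvariantCulture,
                                         DateTimeStyles.None, out day);
        if (!ok)
        {
            Console.Error.WriteLine("Usage: <dirPath> <pattern, e.g. *.xml> <yyyy-MM-dd> <searchText>");
            return 1;
        }

        var foundFilesCtr = 0;
        foreach (var path in FindCandidates(args[0], args[1], day))
        {
            if (ContainsText(path, args[3]))
            {
                Console.WriteLine($"File with selected content: {path}");
                foundFilesCtr++;
            }
        }

        Console.WriteLine($"Found {foundFilesCtr} files with text {args[3]}");
        return 0;
    }
}
```

Matching line by line will miss a search text that spans a line break; File.ReadAllText from point 2 covers that case at the cost of loading each file in full. Note also that File.ReadAllLines(path).Contains(searchText), as used in the question, is the LINQ sequence Contains, so it only matches when an entire line equals the search text; it is not a substring check.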