I need to search a large number of network shares for a given set of files within an MVC.NET application. Doing so serially works, but is very slow.
I can use Parallel.ForEach in a console application and it seems to work well, but Parallel.ForEach does not seem to work in MVC.NET, and async/await is recommended from what I can tell.
    static void SearchAll()
    {
        var shares = new[] { @"\\share1\dir1", @"\\share2\dir2", @"\\share3\dir5" };
        var lookfor = new[] { "file.txt", "file2.txt", "file3.jpg", "file4.xml", "file5.zip" };
        var paths = new List<string>();
        var sw = System.Diagnostics.Stopwatch.StartNew();
        foreach (var share in shares)
        {
            var found = Search(share, lookfor);
            paths.AddRange(found);
        }
        Console.WriteLine($"Found {paths.Count} files in {sw.Elapsed}");
    }

    static List<string> Search(string share, IEnumerable<string> files)
    {
        List<string> found = new List<string>();
        foreach (var filename in files)
        {
            var path = Path.Combine(share, filename);
            if (File.Exists(path))
            {
                found.Add(path);
            }
        }
        return found;
    }
I was hoping to be able to use async/await for searching directories within an MVC.NET Controller Action, but haven't been able to get it to work. Since there is no File.ExistsAsync or EnumerateFilesAsync, I'm not sure of the best way to wrap those synchronous calls to enable searching multiple directories. It seems like this problem is suited to async/await because it is network/IO bound.
Since there is no File.ExistsAsync or EnumerateFilesAsync, I'm not sure of the best way to wrap those synchronous calls to enable searching multiple directories. It seems like this problem is suited to async/await because it is network/IO bound.
Unfortunately, yes. These are I/O-based operations and should have asynchronous APIs, but the Win32 API does not support asynchrony for these kinds of directory-ish operations. Curiously, the device driver layer does (even for local disks), so all the underlying support is there; we just can't get at it.
Parallel.ForEach should work on ASP.NET; it's just not recommended. This is because it will interfere with the ASP.NET thread pool heuristics. E.g., if you do a large Parallel operation, other incoming requests may have to wait longer to be processed due to thread pool exhaustion. There are some mitigations for this, like setting the minimum number of thread pool threads to the default plus whatever your MaxDegreeOfParallelism is (and ensuring there's only one Parallel operation at a time). Or you could go as far as to break out the file enumeration into a separate (private) API call so it exists in its own AppDomain on the same server, with its own separate thread pool.
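As a rough illustration of the first mitigation, here is a minimal sketch that caps the parallelism and raises the thread pool minimum by the same amount. The class name ShareSearch and the helper ReserveWorkerThreads are hypothetical names made up for this example, and the right value for the degree of parallelism is something you would have to measure for your own shares. The sketch does not enforce the "only one Parallel operation at a time" rule; that would need additional coordination.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ShareSearch
{
    // Hypothetical helper: raise the minimum worker-thread count by the number
    // of threads the Parallel loop may use. Intended to be called once at
    // application startup; calling it repeatedly keeps raising the minimum.
    public static bool ReserveWorkerThreads(int extraWorkers)
    {
        int workers, ioThreads;
        ThreadPool.GetMinThreads(out workers, out ioThreads);
        return ThreadPool.SetMinThreads(workers + extraWorkers, ioThreads);
    }

    // Checks every share for every file name, using at most
    // maxDegreeOfParallelism threads at once.
    public static List<string> SearchAll(
        IEnumerable<string> shares,
        IReadOnlyCollection<string> lookFor,
        int maxDegreeOfParallelism)
    {
        // ConcurrentBag is safe to add to from multiple threads.
        var found = new ConcurrentBag<string>();
        var options = new ParallelOptions
        {
            MaxDegreeOfParallelism = maxDegreeOfParallelism
        };

        Parallel.ForEach(shares, options, share =>
        {
            foreach (var filename in lookFor)
            {
                var path = Path.Combine(share, filename);
                if (File.Exists(path))
                {
                    found.Add(path);
                }
            }
        });

        return found.ToList();
    }
}
```

Assuming a limit of 4, you would call ShareSearch.ReserveWorkerThreads(4) once at startup and then ShareSearch.SearchAll(shares, lookfor, 4) from the controller action. The order of the returned paths is not guaranteed, since shares are processed concurrently.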