I have ~500 tasks, each of which takes ~5 seconds, most of it spent waiting for a remote resource to reply. I would like to choose the number of threads to spawn myself (after some testing) and run the tasks on those threads. When one task finishes, I would like to start another task on the thread that became available.
I found System.Threading.Tasks the easiest way to achieve what I want, but I think it is impossible to specify the number of tasks that should be executed in parallel. On my machine it's always around 8 (quad-core CPU). Is it possible to specify how many tasks should be executed in parallel? If not, what would be the easiest way to achieve what I want? (I tried with threads, but the code is much more complex.) I tried increasing the MaxDegreeOfParallelism parameter, but it only limits the maximum number, so no luck there...
This is the code that I have currently:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace ConsoleApplication1
{
    class Program
    {
        private static List<string> _list = new List<string>();
        private static int _toProcess = 0;

        static void Main(string[] args)
        {
            for (int i = 0; i < 1000; ++i)
            {
                _list.Add("parameter" + i);
            }

            var w = new Worker();
            var w2 = new StringAnalyzer();

            Parallel.ForEach(_list, new ParallelOptions() { MaxDegreeOfParallelism = 32 }, item =>
            {
                ++_toProcess;
                string data = w.DoWork(item);
                w2.AnalyzeProcessedString(data);
            });

            Console.WriteLine("Finished");
            Console.ReadKey();
        }

        static void Done(Task<string> t)
        {
            Console.WriteLine(t.Result);
            --_toProcess;
        }
    }

    class Worker
    {
        public string DoWork(string par)
        {
            // It's a long running but not CPU heavy task (downloading stuff from the internet)
            System.Threading.Thread.Sleep(5000);
            return par + " processed";
        }
    }

    class StringAnalyzer
    {
        public void AnalyzeProcessedString(string data)
        {
            // Rather short, not CPU heavy
            System.Threading.Thread.Sleep(1000);
            Console.WriteLine(data + " and analyzed");
        }
    }
}
As L.B mentioned, the .NET Framework has methods that perform I/O operations (requests to databases, web services etc.) using IOCP internally. They can be recognized by their names, which end with Async by convention. So you can just use them to build robust, scalable applications that process multiple requests simultaneously.
EDIT: I've completely rewritten the code example using modern best practices, so it is much more readable, shorter and easier to use.
With .NET 4.5 we can use the async-await approach:
using System;
using System.Collections.Generic;
using System.Net;
using System.Threading.Tasks;

class Program
{
    static void Main(string[] args)
    {
        var task = Worker.DoWorkAsync();
        task.Wait(); // block until the async method has completed
        foreach (var item in task.Result)
        {
            Console.WriteLine(item);
        }
        Console.ReadLine();
    }
}

static class Worker
{
    public static async Task<IEnumerable<string>> DoWorkAsync()
    {
        List<string> results = new List<string>();
        for (int i = 0; i < 10; i++)
        {
            var request = (HttpWebRequest)WebRequest.Create("http://microsoft.com");
            // No thread is blocked while the response is pending
            using (var response = await request.GetResponseAsync())
            {
                results.Add(response.ContentType);
            }
        }
        return results;
    }
}
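The loop above awaits each request before starting the next, so the requests run one at a time. To keep a chosen number of operations in flight, which is what the question asks for, one common approach is to guard each operation with a SemaphoreSlim and await them all with Task.WhenAll. The following is only a minimal sketch of that idea: ThrottledRunner and RunAsync are hypothetical names, the limit of 32 is taken from the question, and Task.Delay stands in for the real remote call.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, not part of the .NET Framework.
static class ThrottledRunner
{
    // Runs work(item) for every item, with at most maxConcurrency
    // operations in flight at any moment. Results keep the input order.
    public static async Task<TResult[]> RunAsync<TItem, TResult>(
        IEnumerable<TItem> items, int maxConcurrency, Func<TItem, Task<TResult>> work)
    {
        using (var gate = new SemaphoreSlim(maxConcurrency))
        {
            var tasks = items.Select(async item =>
            {
                await gate.WaitAsync();      // wait for a free slot
                try
                {
                    return await work(item);
                }
                finally
                {
                    gate.Release();          // free the slot for the next item
                }
            }).ToList();                     // ToList forces every task to be created

            return await Task.WhenAll(tasks);
        }
    }
}

class ThrottleDemo
{
    static void Main()
    {
        var items = Enumerable.Range(0, 500).Select(i => "parameter" + i);

        var results = ThrottledRunner.RunAsync(items, 32, async item =>
        {
            await Task.Delay(5000);          // assumption: stand-in for the remote call
            return item + " processed";
        }).Result;                           // block here only because Main is not async

        Console.WriteLine(results.Length + " items processed");
    }
}
```

Because the waiting is asynchronous, the limit can be tuned freely (after some testing) without tying up one thread per pending request; as soon as one operation finishes, the semaphore lets the next one start.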
MSDN has a nice tutorial on asynchronous programming with async-await.