I have this code. This is the main function: a parallel for loop that iterates through all the data that needs to be posted and calls a function:
ParallelOptions pOpt = new ParallelOptions();
pOpt.MaxDegreeOfParallelism = 30;
Parallel.For(0, maxsize, pOpt, (index,loopstate) => {
//Calls the function where all the webrequests are made
CallRequests(data1,data2);
if (isAborted)
loopstate.Stop();
});
This function is called inside the parallel loop
public static void CallRequests(string data1, string data2)
{
var cookie = new CookieContainer();
var postData = Parameters[23] + data1 +
Parameters[24] + data2;
HttpWebRequest getRequest = (HttpWebRequest)WebRequest.Create(Parameters[25]);
getRequest.Accept = Parameters[26];
getRequest.KeepAlive = true;
getRequest.Referer = Parameters[27];
getRequest.CookieContainer = cookie;
getRequest.UserAgent = Parameters[28];
getRequest.Method = WebRequestMethods.Http.Post;
getRequest.AllowWriteStreamBuffering = true;
getRequest.ProtocolVersion = HttpVersion.Version10;
getRequest.AllowAutoRedirect = false;
getRequest.ContentType = Parameters[29];
getRequest.ReadWriteTimeout = 5000;
getRequest.Timeout = 5000;
getRequest.Proxy = null;
byte[] byteArray = Encoding.ASCII.GetBytes(postData);
getRequest.ContentLength = byteArray.Length;
Stream newStream = getRequest.GetRequestStream(); //open connection
newStream.Write(byteArray, 0, byteArray.Length); // Send the data.
newStream.Close();
HttpWebResponse getResponse = (HttpWebResponse)getRequest.GetResponse();
if (getResponse.Headers["Location"] == Parameters[30])
{
//These are simple get requests to retrieve the source code using the same format as above.
//I need to preserve the cookie
GetRequets(data1, data2, Parameters[31], Parameters[13], cookie);
GetRequets(data1, data2, Parameters[32], Parameters[15], cookie);
}
}
From what I have seen and been told, I understand that making these requests async is a better idea than using a parallel loop. My method is also heavy on the processor. How can I make these requests async while preserving the multithreaded aspect? I also need to keep the cookie after the POST request finishes.
Converting the CallRequests method to async is really just a case of switching the sync method calls for async ones with the await keyword and changing the method signature to return Task.
Something like this:
public static async Task CallRequestsAsync(string data1, string data2)
{
var cookie = new CookieContainer();
var postData = Parameters[23] + data1 +
Parameters[24] + data2;
HttpWebRequest getRequest = (HttpWebRequest)WebRequest.Create(Parameters[25]);
getRequest.Accept = Parameters[26];
getRequest.KeepAlive = true;
getRequest.Referer = Parameters[27];
getRequest.CookieContainer = cookie;
getRequest.UserAgent = Parameters[28];
getRequest.Method = WebRequestMethods.Http.Post;
getRequest.AllowWriteStreamBuffering = true;
getRequest.ProtocolVersion = HttpVersion.Version10;
getRequest.AllowAutoRedirect = false;
getRequest.ContentType = Parameters[29];
getRequest.ReadWriteTimeout = 5000;
getRequest.Timeout = 5000;
getRequest.Proxy = null;
byte[] byteArray = Encoding.ASCII.GetBytes(postData);
getRequest.ContentLength = byteArray.Length;
Stream newStream = await getRequest.GetRequestStreamAsync(); //open connection
await newStream.WriteAsync(byteArray, 0, byteArray.Length); // Send the data.
newStream.Close();
HttpWebResponse getResponse = (HttpWebResponse)await getRequest.GetResponseAsync(); // await the response instead of blocking on GetResponse()
if (getResponse.Headers["Location"] == Parameters[30])
{
//These are simple GET requests to retrieve the source code using the same format as above.
//I need to preserve the cookie
//Note: GetRequets is still synchronous here; ideally it is converted the same way and awaited.
GetRequets(data1, data2, Parameters[31], Parameters[13], cookie);
GetRequets(data1, data2, Parameters[32], Parameters[15], cookie);
}
}
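The two GetRequets calls are not shown in the question, so the following is only a minimal sketch of what an async counterpart might look like. The class name RequestHelpers, the method name GetSourceAsync and its parameters are hypothetical. The one thing that matters for keeping the cookie is that the same CookieContainer instance used for the POST is assigned to the GET request.

```csharp
using System.IO;
using System.Net;
using System.Threading.Tasks;

public static class RequestHelpers
{
    // Hypothetical async counterpart of GetRequets; the name and signature are assumptions.
    public static async Task<string> GetSourceAsync(string url, string referer, CookieContainer cookie)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = WebRequestMethods.Http.Get;
        request.Referer = referer;
        request.CookieContainer = cookie; // same instance as the POST, so its cookies are sent along
        request.Proxy = null;

        // Await the response and the body read instead of blocking a thread on them.
        using (var response = (HttpWebResponse)await request.GetResponseAsync())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            return await reader.ReadToEndAsync();
        }
    }
}
```

Inside CallRequestsAsync this would be called as something like `var source = await RequestHelpers.GetSourceAsync(Parameters[31], Parameters[27], cookie);`. One thing to check for your runtime: the HttpWebRequest documentation states that the Timeout property has no effect on asynchronous requests started with BeginGetResponse, so you may need a separate timeout mechanism once the calls are async.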
However, converting the method doesn't, in itself, really get you anywhere, because you still need to await the returned tasks in your main method. A very straightforward (if somewhat blunt) way of doing so would be to simply call Task.WaitAll() (or await Task.WhenAll() if the calling method itself is to become async). Something like this:
var tasks = Enumerable.Range(0, maxsize).Select(index => CallRequestsAsync(data1, data2));
Task.WaitAll(tasks.ToArray());
However, this is really pretty blunt and loses control over how many iterations are running in parallel, etc. I MUCH prefer use of the TPL dataflow library for this sort of thing. This library provides a way of chaining async (or sync for that matter) operations in parallel and passing them from one "processing block" to the next. It has a myriad of options for tweaking degrees of parallelism, buffer sizes, etc.
A detailed exposé is beyond the possible scope of this answer, so I'd encourage you to read up on it, but one possible approach would be to simply push this to an action block - something like this:
var actionBlock = new ActionBlock<int>(async index =>
{
await CallRequestsAsync(data1, data2);
}, new ExecutionDataflowBlockOptions
{
MaxDegreeOfParallelism = 30,
BoundedCapacity = 100,
});
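for (int i = 0; i < maxsize; i++) // i < maxsize matches Parallel.For(0, maxsize), whose upper bound is exclusive
{
// With BoundedCapacity set, Post() returns false and drops the item when the block is full, so wait on SendAsync instead
actionBlock.SendAsync(i).Wait(); // or await actionBlock.SendAsync(i) if calling method is also async
}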
actionBlock.Complete();
actionBlock.Completion.Wait(); // or await actionBlock.Completion if calling method is also async
A couple of additional points that I should mention in passing:
It looks like your CallRequests method is updating some external variable with its results. Where possible it's best to avoid this pattern and have the method return the results for collation later (which the TPL Dataflow library handles through TransformBlock<>). If updating external state is unavoidable then make sure you have thought about the multithreaded implications (deadlocks, race conditions, etc.), which are outside the scope of my answer.
What is the purpose of index, which has been lost when you created a minimal description for your question? Does it index into a parameter list or something similar? If so, you can always just iterate over these directly and change the ActionBlock<int> to an ActionBlock<{--whatever the type of your parameter is--}>.
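To make both points a bit more concrete, here is a rough sketch of how returning results through a TransformBlock and iterating over the parameters directly might fit together. PostItem, ProcessAsync and RunAsync are hypothetical names: ProcessAsync is just a stand-in for a CallRequestsAsync that returns its result rather than writing to shared state. It assumes the System.Threading.Tasks.Dataflow NuGet package is referenced.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow; // NuGet package: System.Threading.Tasks.Dataflow

public static class DataflowSketch
{
    // Hypothetical input type standing in for whatever the loop index pointed at.
    public sealed class PostItem
    {
        public string Data1 { get; set; }
        public string Data2 { get; set; }
    }

    // Stand-in for a CallRequestsAsync that returns its result (assumption, not the real method).
    private static async Task<string> ProcessAsync(PostItem item)
    {
        await Task.Delay(10); // simulates the web requests
        return item.Data1 + ":" + item.Data2;
    }

    public static async Task<List<string>> RunAsync(IEnumerable<PostItem> items)
    {
        var options = new ExecutionDataflowBlockOptions
        {
            MaxDegreeOfParallelism = 30,
            BoundedCapacity = 100,
        };

        // Up to 30 items are processed concurrently; output order follows input order by default.
        var transform = new TransformBlock<PostItem, string>(item => ProcessAsync(item), options);

        // An ActionBlock runs one item at a time by default, so the list needs no locking.
        var results = new List<string>();
        var collect = new ActionBlock<string>(r => results.Add(r));

        transform.LinkTo(collect, new DataflowLinkOptions { PropagateCompletion = true });

        foreach (var item in items)
        {
            await transform.SendAsync(item); // waits when the bounded buffer is full
        }

        transform.Complete();
        await collect.Completion; // completes once every result has been collected
        return results;
    }
}
```

The design choice here is that all shared state lives in the single-threaded collecting block, so the parallel part stays free of locks.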