Let's consider a worker role that:
The processing methods perform some Azure Storage I/O, HttpClient calls to external APIs, and Entity Framework calls. Now I want my worker role to shut down gracefully so that all pending operations are finished or cancelled in a managed manner:
1. Stop accepting any incoming requests once RoleEntryPoint.OnStop() is triggered. Does Azure do this for me? If not, how do I enforce it?
2. Allow N seconds for any pending operation to complete.
3. After N seconds, cancel any operations left. The cancellation must not exceed M seconds, so that N + M < 5 minutes. I believe 5 minutes is the guaranteed time the Azure runtime will wait after it has triggered OnStop() and before it terminates the process.
I'm imagining something like this:
    public override void Run() {
        // create a cancellation token source
        try {
            // pass the token to all processing/listening routines
        }
        catch (Exception e) { }
    }
    public override void OnStop() {
        try {
            // trigger the cancellation token source
        }
        catch (Exception e) { }
    }
The naive sample above assumes that all my processing routines are async from top to bottom (down to the EF/HttpClient calls). If that's the way to go, I need a working example that takes care of the preconditions (WCF host, queue listeners).
Open questions:
1. OnStop() is triggered? This is important for fitting the shutdown code into the 5-minute limit.
2. How to find out concrete numbers for N and M considering all the stuff like WCF channel timeouts, EF timeouts, etc. in the configuration file?
Stop accepting any incoming requests once RoleEntryPoint.OnStop() is triggered. Does Azure do this for me? If not, how do I enforce it?
As the official documentation states about ServiceHost.Close():
The Close method allows any unfinished work to be completed before returning. For example, finish sending any buffered messages.
To gracefully shut down a WCF service so that it stops receiving new requests but allows existing connections to continue, you could refer to this issue.
For listening to Service Bus queues, you could define a CancellationTokenSource object and invoke CancellationTokenSource.Cancel() once RoleEntryPoint.OnStop() is triggered. Then check whether cancellation has been requested on the CancellationTokenSource as follows:
    try
    {
        if (!_cancellationTokenSource.IsCancellationRequested)
        {
            // retrieve and process the message
        }
    }
    catch (Exception)
    {
        // Handle any message processing specific exceptions here
    }
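The check above would normally sit inside a polling loop. Here is a minimal sketch of such a loop, using only System.Threading types. Note that receiveAsync and handleAsync are hypothetical delegates standing in for the real Service Bus receive call and your own processing routine; they are not SDK APIs, and QueueListener is just an illustrative name.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class QueueListener
{
    // receiveAsync and handleAsync are hypothetical stand-ins for the real
    // queue receive call and the message-processing routine.
    // Returns the number of messages that were processed successfully.
    public static async Task<int> ListenAsync(
        Func<CancellationToken, Task<string>> receiveAsync,
        Func<string, Task> handleAsync,
        CancellationToken token)
    {
        int processed = 0;
        while (!token.IsCancellationRequested)
        {
            string message;
            try
            {
                // Passing the token lets a cancellation interrupt a pending receive.
                message = await receiveAsync(token);
            }
            catch (OperationCanceledException)
            {
                break; // shutdown was requested while waiting for a message
            }

            if (message == null) continue; // nothing received; poll again

            try
            {
                // The handler does not get the token here, so a message that
                // was already picked up is processed to completion.
                await handleAsync(message);
                processed++;
            }
            catch (Exception)
            {
                // Handle any message processing specific exceptions here
            }
        }
        return processed;
    }
}
```

With this shape, calling Cancel() from OnStop() stops new messages from being picked up, while a message that is already being handled is allowed to finish. Whether in-flight handlers should also observe the token is a design choice that depends on how long they can run.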
Allow N seconds for any pending operation to complete
As I understand it, you could simply call Task.Delay(TimeSpan.FromSeconds(N)).Wait() in the OnStop function after you invoke CancellationTokenSource.Cancel() and close the WCF service. Any operations still pending after that would be discarded when the worker role instance shuts down.
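If you want the N/M budget from the question enforced explicitly instead of using a fixed delay, one possible shape is sketched below. This is an assumption about how the shutdown could be structured, not something the Azure runtime provides: ShutdownAsync, gracePeriod (N) and cancelPeriod (M) are hypothetical names, and the pending tasks are assumed to observe the token of the supplied CancellationTokenSource.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class GracefulShutdown
{
    // Returns true if every pending task completed (successfully, faulted,
    // or cancelled) within the N + M budget; false if some task is still running.
    public static async Task<bool> ShutdownAsync(
        Task[] pending,
        CancellationTokenSource cts,
        TimeSpan gracePeriod,   // N: time allowed for work to finish on its own
        TimeSpan cancelPeriod)  // M: time allowed for cancellation to be observed
    {
        Task all = Task.WhenAll(pending);

        // Phase 1: wait up to N without cancelling anything.
        if (await Task.WhenAny(all, Task.Delay(gracePeriod)) == all)
            return true;

        // Phase 2: request cancellation, then wait up to M more.
        cts.Cancel();
        return await Task.WhenAny(all, Task.Delay(cancelPeriod)) == all;
    }
}
```

Since OnStop() is a synchronous override, it would have to block on the result, for example with ShutdownAsync(...).GetAwaiter().GetResult(), and N + M should stay comfortably below the 5-minute limit mentioned in the question. A false return value means some operations ignored the cancellation and will be cut off when the process terminates.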
How to find out concrete numbers for N and M considering all the stuff like WCF channel timeouts, EF timeouts, etc. in the configuration file?
You could use Application Insights with your worker role to collect metrics data and choose a reasonable value for N, so as to reduce the failed request rate while still letting your VM restart quickly and begin processing new requests. You could also refer to this tutorial about handling the Azure OnStop event.