Search code examples
c#wcfwindows-services

WCF Service hosted in Windows Service hangs on stop


I have a WCF Service hosted in a Windows service as described here. I have scheduled nightly restart of the service, but sometimes the restart fails and the service remains/hangs in Stopping state and the EXE process has to be killed manually. It looks likely that it hangs on line _ESSServiceHost.Close();, because nothing after that line is logged it the log file. It is possible but not very likely that the service gets the stop request when it is busy. Moreover the underlying process cannot be killed because it is dependent on services.exe, so only server restart works.

What could be wrong with this approach?

  protected override void OnStop()
    {
        try
        {
            if (_ESSServiceHost != null)
            {
                _ESSServiceHost.Close();
                _ESSServiceHost = null;
                //Never reaches the following line
                Tools.LogInfo("Services stopped.");
            }
        }
        catch (Exception ex)
        {
            Tools.LogError(ex.Message);
        }

This is how I stop the service:

    private bool StopService(ServiceController scESiftServer)
    {
        int i = 0;
        if (scESiftServer.Status == ServiceControllerStatus.Running)
        {
            try
            {
                scESiftServer.Stop();
            }
            catch (Exception ex)
            {
                Tools.LogEvent("Exception ...");
                return false;
            }
            while (scESiftServer.Status != ServiceControllerStatus.Stopped && i < 120)
            {
                Thread.Sleep(1000);
                scESiftServer.Refresh();
                i++;
            }
        }
        if (scESiftServer.Status != ServiceControllerStatus.Stopped)
        {
            //This line gets executed
            Tools.LogEvent("Failed within 120 sec...");
            return false;
        }
        else
        {
            Tools.LogEvent("OK ...");
        }
        return true;
    }

Could something like this help?

var task = Task.Run(() => _ESSServiceHost.Close(TimeSpan.FromSeconds(299)));
if (!task.Wait(TimeSpan.FromSeconds(300)))
{
    _ESSServiceHost.Abort();
}

But _ESSServiceHost.Abort() should be called internally by the Close method if needed. Target framework is 4.5, installed is .NET 4.7.2.

Found out that probably the service hangs after series of malformed requests. Expected record type 'Version', found '71'. etc.


Solution

  • I have found in the svclog file that my service hangs after series of malformed request that happen on Saturday and Sunday at approx. 5:15 AM. The error messages were Expected record type 'Version', found '71'., Error while reading message framing format at position 0 of stream (state: ReadingVersionRecord). But I could not find the cause of theese malformed request series, so I tried to fix the service to withstand the "attack".

    I have modified the OnStop method as follows:

    protected override void OnStop()
    {
        try
        {
            if (_ESSServiceHost != null)
            {
                Tools.LogInfo("Stopping ESService.");
                var abortTask = Task.Run(() => _ESSServiceHost.Abort());
                var closeTask = Task.Run(() => _ESSServiceHost.Close(TimeSpan.FromSeconds(300)));
                try
                {
                    if (_ESSServiceHost.State == CommunicationState.Faulted)
                    {
                        Tools.LogInfo("ESSServiceHost.State == CommunicationState.Faulted");
                        if (!abortTask.Wait(TimeSpan.FromSeconds(60)))
                            Tools.LogInfo("Failed to Abort.");
                    }
                    else
                    {
                        if (!closeTask.Wait(TimeSpan.FromSeconds(301)))
                        {
                            Tools.LogInfo("Failed to Close - trying Abort.");
                            if (!abortTask.Wait(TimeSpan.FromSeconds(60)))
                                Tools.LogInfo("Failed to Abort.");
    
                        }                            
                    }
                }
                catch (Exception ex)
                {
                    Tools.LogException(ex, "ESSServiceHost.Close");
                    try
                    {
                        Tools.LogInfo("Abort.");
                        if (!abortTask.Wait(TimeSpan.FromSeconds(60)))
                            Tools.LogInfo("Failed to Abort.");
                    }
                    catch (Exception ex2)
                    {
                        Tools.LogException(ex2, "ESSServiceHost.Abort");
    
                    }
                }
                _ESSServiceHost = null;
                Tools.LogInfo("ESService stopped.");
            }
        }
        catch (Exception ex)
        {
            Tools.LogException(ex,"OnStop");
        }
    }
    

    Today on Monday I have checked the svclog and the "attacks" with malformed request remained there but my service lived happily through it. So it seemed to be fixed. Moreover only:

    Stopping ESService.

    ESService stopped.

    events were logged in my log file. No aborts etc. So I guess that putting the Close call on a separate thread fixed the problem but absolutely do not know why.