Tags: c#, windows-services, topshelf

How does Topshelf prevent a Windows service from stopping?


I'm trying to replicate Topshelf's ability to reject a Stop command in a .NET 7 Windows service. As I read the Topshelf code, its service host subclasses ServiceBase, and in the OnStop override, if _serviceHandle.Stop(this) returns false, it throws an exception. The catch block then sets ExitCode = 1064 (TopshelfExitCode.ServiceControlRequestFailed) and rethrows.

    public class WindowsServiceHost :
        ServiceBase,
        Host,
        HostControl
    {

        // omitted for brevity

        protected override void OnStop()
        {
            try
            {
                _log.Info("[Topshelf] Stopping");

                if (!_serviceHandle.Stop(this))
                    throw new TopshelfException("The service did not stop successfully (return false).");

                _log.Info("[Topshelf] Stopped");
            }
            catch (Exception ex)
            {
                _settings.ExceptionCallback?.Invoke(ex);

                _log.Fatal("The service did not shut down gracefully", ex);
                ExitCode = (int) TopshelfExitCode.ServiceControlRequestFailed;
                throw;
            }
        }
    }

Then, looking at the ServiceBase code, it simply calls OnStop, and if an exception is thrown, it resets the service status to the previous state (running) and rethrows.

        private unsafe void DeferredStop () 
        {
            fixed (NativeMethods.SERVICE_STATUS *pStatus = &status) 
            {
                int previousState = status.currentState;

                status.checkPoint = 0;
                status.waitHint = 0;
                status.currentState = NativeMethods.STATE_STOP_PENDING;
                NativeMethods.SetServiceStatus (statusHandle, pStatus);
                try 
                {
                    OnStop();

                    // omitted for brevity...
                }
                catch (Exception e) 
                {
                    status.currentState = previousState;
                    NativeMethods.SetServiceStatus (statusHandle, pStatus);
                    WriteEventLogEntry (Res.GetString (Res.StopFailed, e.ToString ()), EventLogEntryType.Error);
                    throw; 
                }
            } // fixed
        } // DeferredStop

I've created my own class for .NET 7 deriving from WindowsServiceLifetime (which itself derives from ServiceBase) with essentially the same code, but when I throw an exception in OnStop, the service still stops.

[System.Runtime.Versioning.SupportedOSPlatform( "windows" )]
public class CancelableWindowsServiceLifetime : Microsoft.Extensions.Hosting.WindowsServices.WindowsServiceLifetime
{
    private readonly ICancelableBackgroundService cancelableBackgroundService;

    public CancelableWindowsServiceLifetime( IHostEnvironment environment, IHostApplicationLifetime applicationLifetime, ICancelableBackgroundService cancelableBackgroundService, ILoggerFactory loggerFactory, IOptions<HostOptions> optionsAccessor )
        : this( environment, applicationLifetime, cancelableBackgroundService, loggerFactory, optionsAccessor, Options.Create( new WindowsServiceLifetimeOptions() ) )
    {
    }

    public CancelableWindowsServiceLifetime( IHostEnvironment environment, IHostApplicationLifetime applicationLifetime, ICancelableBackgroundService cancelableBackgroundService, ILoggerFactory loggerFactory, IOptions<HostOptions> optionsAccessor, IOptions<WindowsServiceLifetimeOptions> windowsServiceOptionsAccessor )
        : base( environment, applicationLifetime, loggerFactory, optionsAccessor, windowsServiceOptionsAccessor )
    {
        this.cancelableBackgroundService = cancelableBackgroundService;
    }

    protected override void OnStop()
    {
        if ( !cancelableBackgroundService.CanStop() )
        {
            ExitCode = 1064; // ERROR_EXCEPTION_IN_SERVICE

            throw new ApplicationException( "Unable to stop service at this time." );
        }

        base.OnStop();
    }
}

I looked at the .NET 7 version of ServiceBase and don't see any real differences in the implementation. Any ideas why my version is not working?

Essentially, I want to prevent my service from stopping while jobs are running. However, I do have a safeguard in place: if a stop is requested enough times within a sliding window, the stop is permitted and the running job is terminated.
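
To make the sliding-window safeguard concrete, here is a minimal sketch of how such a counter might look. The class name `StopRequestWindow`, its `RegisterRequest` method, and the threshold and window parameters are hypothetical names for illustration, not part of my actual service or of any library.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (illustration only): counts stop requests inside a
// sliding time window and reports when a forced stop should be permitted.
public sealed class StopRequestWindow
{
    private readonly Queue<DateTimeOffset> requests = new();
    private readonly int threshold;
    private readonly TimeSpan window;

    public StopRequestWindow( int threshold, TimeSpan window )
    {
        if ( threshold < 1 ) throw new ArgumentOutOfRangeException( nameof( threshold ) );
        this.threshold = threshold;
        this.window = window;
    }

    // Records a stop request made at 'now' and returns true once at least
    // 'threshold' requests (including this one) fall within the window.
    public bool RegisterRequest( DateTimeOffset now )
    {
        // Drop requests that have aged out of the window.
        while ( requests.Count > 0 && now - requests.Peek() > window )
        {
            requests.Dequeue();
        }

        requests.Enqueue( now );
        return requests.Count >= threshold;
    }
}
```

In an OnStop override, a counter like this could be consulted before throwing: when `RegisterRequest( DateTimeOffset.UtcNow )` returns true, the stop would be allowed even though jobs are still running.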


Solution

  • In case anyone wants to know the solution I'm currently using, this is how I achieved my goal:

    By default, prevent the user (normally me) from stopping the service if there are jobs running, but allow a forced shutdown if needed (even if there are jobs running).

    The basic idea is to follow the pattern ASP.NET Core websites use with the presence of an app_offline.htm file. When a 'Stop' request comes in, the service delays stopping indefinitely until either no jobs are running or a SvcOffline.txt file exists. That way, if a user requests a stop, it is 'prevented'; however, if they need to force the shutdown regardless of running jobs, they can simply drop a SvcOffline.txt file in the service directory to permit the shutdown.

    The experience in SCM is as follows:

    1. Start the service (envision that jobs start processing and are long-running).
    2. Attempt to stop the service. SCM will display the message:

    Windows is attempting to stop the following service on Local Computer...KAT FTP Service

    3. If the jobs finish (or SvcOffline.txt exists) before the 'failed to stop' message is displayed, everything shuts down gracefully.

    4. If the service does not stop before the following message appears, SCM displays a status of 'Stopping', and once the service does shut down, SCM clears that status.

    Windows could not stop the KAT FTP Service service on Local Computer.

    Error 1053: The service did not respond to the start or control request in a timely fashion.

    Note: If the service does not allow the shutdown before the 'failed to stop' message appears, host.Run() later throws an AggregateException that must be caught.

    Below is the code I used to accomplish this. Any comments are welcome.

    Program.cs

    var host = Host.CreateDefaultBuilder( args )
        .ConfigureServices( ( builder, services ) =>
        {
            services
                .AddWindowsService( options => options.ServiceName = "KAT FTP Service" )
                .AddHostedService<Worker>();
        } )
        .Build();
    
    try
    {
        host.Run(); 
    }
    catch ( AggregateException ex ) when ( ex.InnerException is OperationCanceledException )
    {
        // Ignore this error. If a stop was requested and the service blocked it long enough for SCM to
        // display its 'could not stop' message, this exception is thrown. If the InnerException is an
        // OperationCanceledException, the service either finished its jobs or permitted the stop via
        // SvcOffline.txt, so it is safe to ignore.
    }
    

    Worker.cs

    public partial class Worker : BackgroundService
    {
        private readonly IHostEnvironment hostEnvironment;
        private readonly ILogger<Worker> logger;
        private readonly List<FileWatcherNotification> monitors = new();
        private ServiceSettings settings;
    
        public Worker( IOptionsMonitor<ServiceSettings> settingsMonitor, IHostApplicationLifetime hostApplicationLifetime, IHostEnvironment hostEnvironment, ILogger<Worker> logger )
        {
            this.hostEnvironment = hostEnvironment;
            this.logger = logger;
            settings = settingsMonitor.CurrentValue;
    
            // Delay service stop indefinitely until jobs are finished or forced shutdown
            hostApplicationLifetime.ApplicationStopping.Register( () =>
            {
                StopMonitors(); // prevent new notifications from being processed
    
                while ( !CanStop )
                {
                    Task.Delay( 10 * 1000 ).Wait();
                }
    
                DisposeMonitors();
            } );
        }
    
        protected override async Task ExecuteAsync( CancellationToken stoppingToken )
        {
            try
            {
                await InitializeMonitorsAsync( settings );
    
                // This service simply reacts to FileWatcherNotifications so nothing
                // to do here except wait for the service to stop.
    
                var tcs = new TaskCompletionSource();
                using var registration = stoppingToken.Register( s => ( (TaskCompletionSource)s! ).SetResult(), tcs );
                await tcs.Task.ConfigureAwait( false );
            }
            catch ( Exception ex ) when ( ex is not OperationCanceledException)
            {
                logger.LogError( ex, "{message}", ex.Message );
                Environment.Exit( 1 );
            }
        }
    
        private bool shutdownBlockWarned;
        public bool CanStop
        {
            get
            {
                var workerCount = monitors.Sum( m => m.CurrentWorkers );
    
                if ( workerCount == 0 ) return true;
    
                if ( !shutdownBlockWarned )
                {
                    LogWarningBlockingShutdown( workerCount );
                    shutdownBlockWarned = true;
                }
    
                var svcOfflinePath = Path.Combine( hostEnvironment.ContentRootPath, "SvcOffline.txt" );
                if ( File.Exists( svcOfflinePath ) )
                {
                    var filesProcessing = string.Join( Environment.NewLine, monitors.SelectMany( m => m.ProcessingFiles ) );
                    LogErrorForcedShutdown( workerCount, filesProcessing );
    
                    File.Move( svcOfflinePath, Path.Combine( hostEnvironment.ContentRootPath, "_SvcOffline.txt" ) );
    
                    return true;
                }
    
                return false;
            }
        }
    }
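
    The wait loop in the ApplicationStopping callback could also be factored into a small reusable helper. The following is only a sketch under that assumption; `StopGate` and `WaitUntilAsync` are hypothetical names, not part of the code above or of any library.

    ```csharp
    using System;
    using System.Threading;
    using System.Threading.Tasks;

    // Hypothetical helper (illustration only): polls a condition at a fixed interval.
    public static class StopGate
    {
        // Polls 'canStop' every 'pollInterval' until it returns true.
        // Throws OperationCanceledException if 'cancellationToken' is cancelled first.
        public static async Task WaitUntilAsync( Func<bool> canStop, TimeSpan pollInterval, CancellationToken cancellationToken = default )
        {
            while ( !canStop() )
            {
                await Task.Delay( pollInterval, cancellationToken ).ConfigureAwait( false );
            }
        }
    }
    ```

    Inside the callback it would be used as `StopGate.WaitUntilAsync( () => CanStop, TimeSpan.FromSeconds( 10 ) ).GetAwaiter().GetResult();`. The callback itself is synchronous, so the call still blocks there, but the polling logic becomes testable on its own.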