MSMQ, Windows Service and Server Restarts


I am working with a legacy Windows Service that reads messages from a private MSMQ queue, processes them (does some database work, sends some emails), and then waits for the next message (PeekCompleted).

The service is problematic: whenever Windows Update requires a server reboot (which is almost always), the service comes back up in a "Started" state but has to be restarted manually, or the messages just pile up in the queue.

My first inclination is that something in the OnStart handler isn't getting hit when the server comes back up, and I am attempting to sort out the logs (another story). Windows Services and threading are not my normal domain, though, so I am hoping someone can point me in the right direction.

Below are the OnStart handler and the message-handling function, with the inconsequential stuff stripped out.

Question: in OnStart, the MessageReceived function is attached to the PeekCompleted event. I assume OnStart fires when the server comes back up, so the handler must get attached, but I am not clear whether messages that (a) were already in the queue at reboot or (b) arrive during the reboot will actually trigger the event.

If they should, is there something else I should be looking for?

Any suggestions welcome!

    protected override void OnStart(string[] args)
    {
        try
        {
            _inProcess = false;
            _queueMessage = null;
            _stopping = false;
            _queue = ReadyQueue(_queueName);
            if (_queue == null)
            {
                throw new Exception(string.Format("'ReadyFormQueue({0})' returned null", _queueName));
            }
            _queue.PeekCompleted += new PeekCompletedEventHandler(MessageReceived);
            _queue.Formatter = new BinaryMessageFormatter();
            _queue.BeginPeek();
        }
        catch (Exception exception)
        {
            //do cleanup and other recovery stuff
        }
    }


    private void MessageReceived(object sender, PeekCompletedEventArgs e)
    {
        _currentMessage = null;
        _inProcess = false;
        try
        {
            _queueMessage = _queue.EndPeek(e.AsyncResult);
            _queueMessage.Formatter = new BinaryMessageFormatter();
            _currentMessage = (MyMessageType)_queueMessage.Body;
            _queue.ReceiveById(_queueMessage.Id);
            _inProcess = true;
            _helper = new MessageHelper();
            _currentMessage = _helper.Process(_currentMessage);  //sets global _inProcess flag
            if (_inProcess)
            {
                Thread.Sleep((int)(_retryWaitTime * 0x3e8));
                SendFormMessageToQueue(FailedQueueName, _currentMessage);
            }
            else
            {
                _queue.BeginPeek();
            }


        }
        catch (Exception exception)
        {
            _inProcess = false;
            //do other recovery stuff
            if (_currentMessage != null)
            {
                ReadyFormQueue(_poisonQueueName);
                SendFormMessageToQueue(_poisonQueueName, _currentMessage);
            }
        }
    }

Solution

  • This legacy Windows service could be starting before the queueing infrastructure is up and fully operational. In that case it would fail on the initial connection and therefore not process any messages.

    The first thing I would check (unless the Windows service has proper logging to consult) is whether a service dependency is properly set up: you don't want your legacy service to start until the MSMQ service itself has completely started.

    I don't think there is a problem in the legacy service per se, since it seems to work fine once you restart it. I think you have a resource-availability race: the consumer starts before the resource it depends on, and it wasn't designed to recover from that.

    I would create a service dependency (this can be done in the SCM), then reboot the server and see whether MSMQ messages still pile up. My guess is that they won't.

    Hope this helps
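
    As a rough sketch of the dependency suggested above, and assuming the service is installed through a standard .NET installer class, the dependency on MSMQ could be declared as shown below. The class name and the service name "LegacyQueueService" are placeholders, not the real names from the question, and the service account is an assumption. For a service that is already installed, the same dependency can be set from an elevated command prompt with sc config LegacyQueueService depend= MSMQ (the space after depend= is required, and the value replaces any existing dependency list).

    ```csharp
    using System.ComponentModel;
    using System.Configuration.Install;
    using System.ServiceProcess;

    // Hypothetical installer; "LegacyQueueService" is a placeholder name,
    // not the actual service from the question.
    [RunInstaller(true)]
    public class LegacyQueueServiceInstaller : Installer
    {
        public LegacyQueueServiceInstaller()
        {
            var processInstaller = new ServiceProcessInstaller
            {
                // Assumption: the question does not say which account the service uses.
                Account = ServiceAccount.LocalSystem
            };

            var serviceInstaller = new ServiceInstaller
            {
                ServiceName = "LegacyQueueService",
                StartType = ServiceStartMode.Automatic,
                // "MSMQ" is the service name of Message Queuing. With this
                // dependency the SCM starts MSMQ before starting this service.
                ServicesDependedOn = new[] { "MSMQ" }
            };

            Installers.Add(processInstaller);
            Installers.Add(serviceInstaller);
        }
    }
    ```

    The installer only takes effect when the service is (re)installed, so the sc config route is probably the quicker way to test the theory on an existing server before changing any code.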