Azure VMs restart unexpectedly


This problem is related to a VM hosting a worker role. I have a simple worker role that spawns a process. The spawned process is a TCPServer application compiled for 32-bit. The worker role has an endpoint defined, and TCPServer is bound to that endpoint. When I connect to the worker role endpoint and send something, TCPServer receives it, processes it, and returns a response. So the worker role endpoint, which is exposed to the outside world, internally connects to TCPServer.

    string port = RoleEnvironment.CurrentRoleInstance.InstanceEndpoints["TCPsocket"].IPEndpoint.Port.ToString();

    var myProcess = new Process()
    {
        StartInfo = new ProcessStartInfo(Path.Combine(localstorage.RootPath, "TCPServer.exe"))
        {
            CreateNoWindow = true,
            UseShellExecute = true,
            WorkingDirectory = localstorage.RootPath,
            Arguments = port
        }
    };

It was working fine, but suddenly the server stopped responding. When I checked the portal, the VM role was restarting automatically, but the restart never succeeded. The status stayed at "Role Initializing..". A manual stop and start didn't work either. I redeployed the same package without any change to the code, and this time the deployment itself failed.

Warning: All role instances have stopped - There was no endpoint listening at https://management.core.windows.net/<SubscriptionID>/services/hostedservices/TCPServer/deploymentslots/Production that could accept the message. This is often caused by an incorrect address or SOAP action. See InnerException, if present, for more details.

But when I tried to deploy again after some time, it worked fine. Can anyone tell me what the problem could be?

Update:

  public override void Run()
    {
        Trace.WriteLine("RasterWorker entry point called", "Information");
        string configVal = RoleEnvironment.GetConfigurationSettingValue("Microsoft.WindowsAzure.Plugins.Diagnostics.ConnectionString");
        CloudStorageAccount _storageAccount = null;
        _storageAccount = CloudStorageAccount.Parse(configVal); // parses the storage credentials and creates the storage account object
        var localstorage = RoleEnvironment.GetLocalResource("MyLocalStorage");
        CloudBlobClient _blobClient = _storageAccount.CreateCloudBlobClient();
        bool flag = false;

        while (true)
        {
            Thread.Sleep(30000);
            if (!flag)
            {
                if (File.Exists(Path.Combine(localstorage.RootPath, "test.ppm")))
                {
                    CloudBlobContainer _blobContainer = _blobClient.GetContainerReference("reports");
                    CloudBlob _blob = _blobContainer.GetBlobReference("test.ppm");
                    _blob.UploadFile(Path.Combine(localstorage.RootPath, "test.ppm"));
                    Trace.WriteLine("Copy to blob done!!!!!!!", "Information");
                    flag = true;
                }
                else
                {
                    Trace.WriteLine("Copy Failed-> File doesnt exist!!!!!!!", "Information");
                }
            }
            Trace.WriteLine("Working", "Information");
        }
    }

Solution

  • To prevent your worker role from restarting, you need to block the Run method of your entry point class.

    If you do override the Run method, your code should block indefinitely. If the Run method returns, the role is automatically recycled by raising the Stopping event and calling the OnStop method so that your shutdown sequences may be executed before the role is taken offline.

    http://msdn.microsoft.com/en-us/library/windowsazure/microsoft.windowsazure.serviceruntime.roleentrypoint.run.aspx

    You need to make sure that, whatever happens, you never return from the Run method if you want to keep the role alive.

    Now, if you're hosting the TCPServer in a console application (I'm assuming you're doing this since you pasted the Process.Start code), you'll need to block the Run method after starting the process.

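    public override void Run()
    {
        try
        {
            Trace.WriteLine("WorkerRole entrypoint called", "Information");

            // Same lookups as in the question: local storage root and endpoint port.
            var localstorage = RoleEnvironment.GetLocalResource("MyLocalStorage");
            string port = RoleEnvironment.CurrentRoleInstance.InstanceEndpoints["TCPsocket"].IPEndpoint.Port.ToString();

            var myProcess = new Process()
            {
                StartInfo = new ProcessStartInfo(Path.Combine(localstorage.RootPath, "TCPServer.exe"))
                {
                    CreateNoWindow = true,
                    UseShellExecute = true,
                    WorkingDirectory = localstorage.RootPath,
                    Arguments = port
                }
            };
            myProcess.Start();

            // Block forever so that Run never returns and the role is not recycled.
            while (true)
            {
                Thread.Sleep(10000);
                Trace.WriteLine("Working", "Information");
            }
        }
        catch (Exception e)
        {
            Trace.WriteLine("Exception during Run: " + e.ToString());
            // Take other action as needed. Run returns after this block,
            // so the role is recycled if an exception gets here.
        }
    }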
    

    PS: This has nothing to do with your deployment issue. I assume that was a coincidence.
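
One possible variation on the sleep loop: block Run on the child process itself, so the role does not stay "alive" after TCPServer.exe has died. This is only a sketch and not part of the original answer. `ChildProcessRunner` and `RunAndWait` are hypothetical names, and setting `UseShellExecute = false` is my assumption (with `UseShellExecute = true`, `CreateNoWindow` is ignored and `Process.Start` may return null).

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper (not from the original post): starts an executable
// and blocks the calling thread until that process exits.
public static class ChildProcessRunner
{
    // Returns the exit code of the child process once it has terminated.
    public static int RunAndWait(string exePath, string arguments, string workingDirectory)
    {
        var startInfo = new ProcessStartInfo(exePath)
        {
            CreateNoWindow = true,
            // Assumption: start the executable directly instead of via the shell.
            UseShellExecute = false,
            WorkingDirectory = workingDirectory,
            Arguments = arguments
        };

        using (var process = Process.Start(startInfo))
        {
            // Blocks for as long as the child process is running.
            process.WaitForExit();
            return process.ExitCode;
        }
    }
}
```

Inside Run you could call `ChildProcessRunner.RunAndWait(Path.Combine(localstorage.RootPath, "TCPServer.exe"), port, localstorage.RootPath)` in place of the `while (true)` loop and log the exit code. Run then blocks while the server is up and returns when it exits, at which point the role is recycled as described in the MSDN quote above. Whether you want a recycle in that case or would rather restart the process yourself is a design choice.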