How do I scale my Azure application without having a temporary outage?

I'm experimenting with the Windows Azure Management API to scale my Azure web role. At some point I have one instance and decide to go from one instance to two. I send an HTTP POST request to

https://management.core.windows.net:443/<my-subscription-id>/services/hostedservices/<my-service-name>/deployments/<my-deployment-name>/?comp=config

with XML that specifies the same configuration the deployment currently has, but with the instance count set to two. The call succeeds and the change starts. For about 30 seconds after that, the web role does not accept HTTP calls, and the caller gets

10061 connection refused

in the browser, which means the role is not serving client requests. That's a problem.
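
For reference, here is a minimal sketch of how a request body like the one described above could be built. It only prepares the XML and does not send anything. The `ChangeConfiguration` and `Configuration` element names and the two namespaces reflect my understanding of the Service Management API and the .cscfg schema, so check them against the official reference. The role name `WebRole1`, the sample configuration, and both helper functions are hypothetical.

```python
import base64
import xml.etree.ElementTree as ET

# Namespaces as I understand them from the .cscfg schema and the
# Service Management API; treat both as assumptions to verify.
CSCFG_NS = "http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration"
AZURE_NS = "http://schemas.microsoft.com/windowsazure"


def set_instance_count(cscfg_xml: str, role_name: str, count: int) -> str:
    """Return the service configuration with a new instance count for one role."""
    # Serialize the configuration namespace as the default (unprefixed) namespace.
    ET.register_namespace("", CSCFG_NS)
    root = ET.fromstring(cscfg_xml)
    for role in root.findall(f"{{{CSCFG_NS}}}Role"):
        if role.get("name") == role_name:
            instances = role.find(f"{{{CSCFG_NS}}}Instances")
            if instances is None:
                raise ValueError(f"role {role_name!r} has no Instances element")
            instances.set("count", str(count))
            return ET.tostring(root, encoding="unicode")
    raise ValueError(f"role {role_name!r} not found")


def build_change_configuration_body(cscfg_xml: str) -> str:
    """Wrap a base64-encoded configuration in a ChangeConfiguration element."""
    encoded = base64.b64encode(cscfg_xml.encode("utf-8")).decode("ascii")
    return (
        f'<ChangeConfiguration xmlns="{AZURE_NS}">'
        f"<Configuration>{encoded}</Configuration>"
        "</ChangeConfiguration>"
    )


# Hypothetical configuration with a single web role instance.
SAMPLE_CSCFG = (
    f'<ServiceConfiguration serviceName="MyService" xmlns="{CSCFG_NS}">'
    '<Role name="WebRole1"><Instances count="1" /></Role>'
    "</ServiceConfiguration>"
)

scaled_cscfg = set_instance_count(SAMPLE_CSCFG, "WebRole1", 2)
request_body = build_change_configuration_body(scaled_cscfg)
```

The resulting `request_body` is what would be POSTed to the `?comp=config` URL above, authenticated with the subscription's management certificate.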

How do I scale the web role in such a way that it serves client requests at all times?

Solution

As per SLA (Service Level Agreement - Compute):

We guarantee that when you deploy two or more role instances in different fault and upgrade domains your Internet facing roles will have external connectivity at least 99.95% of the time.

This means that running a single instance is not covered by the SLA, so you may (or will) have downtime. If you scale up from two or more instances, or scale down from more to two, there should not be any outage.

This blog post gives a good explanation of fault and upgrade domains. Above all, scaling is an "upgrade": you are changing the configuration, and that change has to be propagated through all roles and instances. The only way to do that without downtime (currently) is to have at least two instances, each of which lives in a separate domain.
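
The reasoning above can be illustrated with a toy model. It is only a sketch and makes two assumptions: instances are spread across upgrade domains round-robin, and a configuration change takes one upgrade domain offline at a time. The function name and the domain count of 5 are illustrative and are not taken from the platform documentation.

```python
from collections import Counter


def min_serving_instances(instance_count: int, upgrade_domains: int) -> int:
    """Worst-case number of instances still serving while one domain is updated.

    Toy model: instance i lives in upgrade domain i % upgrade_domains, and
    exactly one domain is offline at any moment during the change.
    """
    per_domain = Counter(i % upgrade_domains for i in range(instance_count))
    largest_domain = max(per_domain.values(), default=0)
    return instance_count - largest_domain


# One instance: its only domain goes offline, so nothing is serving.
single = min_serving_instances(1, upgrade_domains=5)

# Two instances in separate domains: one keeps serving throughout.
pair = min_serving_instances(2, upgrade_domains=5)
```

In this model `single` is 0 and `pair` is 1, which matches the observation in the question: with one instance there is a window in which nobody answers, and with two there is not.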