I have defined a custom LoadBalancerProbe for my web role as follows:
<?xml version="1.0" encoding="utf-8"?>
<ServiceDefinition name="CloudService" xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceDefinition" schemaVersion="2013-03.2.0">
  <LoadBalancerProbes>
    <LoadBalancerProbe name="MyProbe" protocol="http" intervalInSeconds="15" path="/api/ping" port="80" timeoutInSeconds="30" />
  </LoadBalancerProbes>
  <WebRole name="TestApp" vmsize="Small">
    <Sites>
      <Site name="Web">
        <Bindings>
          <Binding name="Endpoint1" endpointName="Endpoint1" />
        </Bindings>
      </Site>
    </Sites>
    <Endpoints>
      <InputEndpoint name="Endpoint1" protocol="http" port="80" localPort="80" loadBalancerProbe="MyProbe" />
    </Endpoints>
    <Imports>
      <Import moduleName="Diagnostics" />
      <Import moduleName="RemoteAccess" />
      <Import moduleName="RemoteForwarder" />
    </Imports>
  </WebRole>
</ServiceDefinition>
I run 2 instances in Azure. I have enabled trace.axd and can see the load balancer calling the ping method, so the probe is definitely running.
When I want an instance to appear down (I change a config setting on that instance), I can also see my "503" (Service Unavailable) responses in my test app, along with the custom HTTP header from the load balancer, X-MS-LB-MonitorStatus, with the value Down.
When I use a curl request to access the load-balanced URL, it always returns the correct results (if I have set an instance to return 503 rather than 200, that instance does not appear in the response results).
When I use a browser, however (in this case Chrome), I can still get results back from the instance that is supposed to be down (i.e. the instance was available, I disable it, and additional calls to the load-balanced URL still resolve to the "disabled" instance).
I can confirm which instance actually served each request using the trace.axd information.
I'm struggling to believe that Azure is doing load balancing properly here.
The Azure load balancer is a layer 4 (transport-layer) load balancer: it only load balances new incoming TCP connections and knows nothing about the HTTP traffic carried over them.
Typically a browser will establish a TCP connection with keep-alive enabled and will keep that TCP connection open for a period of time, so any subsequent requests to the website are just HTTP traffic over the existing TCP connection. Applications such as curl will typically close the TCP connection after every request.
So in your case the Azure load balancer is behaving correctly, but your browser already has a TCP connection established to the instance that is out of rotation, so future HTTP requests over that connection will still go to the same out-of-rotation instance.
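The difference between the two clients can be reproduced locally without Azure. The following is only a minimal Python sketch, not anything Azure-specific: a local HTTP/1.1 server stands in for a role instance and tags every accepted TCP connection with an ID, and the names `ConnectionIdHandler`, `fetch`, and `demo` are hypothetical. Since the load balancer picks an instance per TCP connection, every request that reports the same connection ID would be served by the same instance.

```python
import http.client
import itertools
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# One ID per accepted TCP connection (hypothetical stand-in for "which
# instance the load balancer chose when the connection was opened").
_connection_ids = itertools.count(1)


class ConnectionIdHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # allow keep-alive (persistent connections)

    def setup(self):
        # setup() runs once per TCP connection, not once per HTTP request.
        super().setup()
        self.connection_id = next(_connection_ids)

    def do_GET(self):
        body = str(self.connection_id).encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def fetch(conn):
    """Send one GET over an existing connection and return the connection ID."""
    conn.request("GET", "/")
    return int(conn.getresponse().read())


def demo():
    server = ThreadingHTTPServer(("127.0.0.1", 0), ConnectionIdHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    try:
        # Browser-like: several requests reuse one keep-alive connection.
        conn = http.client.HTTPConnection(host, port)
        reused = [fetch(conn) for _ in range(3)]
        conn.close()

        # curl-like: a new TCP connection for every request.
        fresh = []
        for _ in range(3):
            one_shot = http.client.HTTPConnection(host, port)
            fresh.append(fetch(one_shot))
            one_shot.close()
    finally:
        server.shutdown()
        server.server_close()
    return reused, fresh


if __name__ == "__main__":
    reused, fresh = demo()
    print("keep-alive connection IDs:", reused)
    print("new-connection IDs:       ", fresh)
```

In this sketch the keep-alive client reports a single repeated connection ID, while the client that reconnects for each request reports a different ID every time. Only the second pattern gives a connection-level load balancer a chance to route each request to a different (healthy) instance.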
To validate that this is what is happening, you can capture the traffic with Netmon or Wireshark on the client side or the server side.
To resolve this you have a few options:
See the 3rd Q&A at http://blogs.msdn.com/b/kwill/archive/2013/02/28/heartbeats-recovery-and-the-load-balancer.aspx for a little more information.