We have a third-party product that runs as a Windows service and is exposed as a web service. The goal is to dynamically provision service instances during peak business hours.
Just to run my plan by you guys:
- I've already deployed the service on multiple VMs, placed the VMs in the same cloud service and Availability Set, and configured Azure to turn VM instances on/off based on CPU usage.
- I plan to configure a separate VM, run IIS ARR (Application Request Routing) on it, and point it at the endpoints of the VMs configured above, in the hope that ARR balances requests across the back-end VMs dynamically.
Will this work? What's the best practice for scaling on IaaS? Any thoughts? I truly appreciate the input.
If I have understood correctly, you just need to use the built-in load balancer of the cloud service. Create a load-balanced set for your endpoint. For example, if you want to balance the incoming traffic to port 80 of your application, all you have to do is create an LB-set for this port and add the port 80 endpoint of every VM in the Cloud Service to that set.
The Azure Load Balancer randomly distributes a specific type of incoming traffic across multiple virtual machines or services in a configuration known as a load-balanced set. For example, you can spread the load of web request traffic across multiple web servers or web roles.
Azure load balancing for virtual machines
It doesn't matter whether a VM is up or down at any given moment: once it is turned on, and provided its endpoint is configured in the same LB-set, it will automatically start receiving requests as soon as port 80 is online (for example, IIS has started and is returning 200 OK). So, to answer your question: yes, it will work with auto-scale or with manually turning VMs on/off.
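To make the "port 80 is online" idea concrete, here is a rough Python sketch of probe-based rotation. It is only an illustration of the concept, not how the Azure load balancer is implemented: `http_probe`, `in_rotation`, and the VM names are made up for the example, and the probe path `/` is an assumption.

```python
from http.client import HTTPException
from typing import Callable, Iterable, List
from urllib.error import URLError
from urllib.request import urlopen


def http_probe(host: str, port: int = 80, path: str = "/", timeout: float = 5.0) -> bool:
    """Return True only if the host answers the probe request with HTTP 200."""
    url = f"http://{host}:{port}{path}"
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (URLError, OSError, HTTPException):
        # VM is off, IIS has not started yet, or the response was not a success.
        return False


def in_rotation(backends: Iterable[str], probe: Callable[[str], bool] = http_probe) -> List[str]:
    """Keep only the backends whose probe currently succeeds."""
    return [host for host in backends if probe(host)]


# Simulated state instead of real network calls (hypothetical VM names):
# vm2 is switched off, so it is left out of the rotation.
vm_state = {"vm1": True, "vm2": False, "vm3": True}
print(in_rotation(vm_state, probe=lambda host: vm_state[host]))  # ['vm1', 'vm3']
```

When auto-scale turns vm2 back on and its probe starts succeeding, the same call includes it again without any change to the set itself, which is why a separate ARR VM is not needed just for this.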