Tags: azure, load-balancing, azure-app-service-plans

Load balancing setup in Azure App Service Plan for frontend and backend communication


I have an application running on an Azure App Service Plan that consists of two simple parts:

  • A frontend hosted on a slot.
  • A backend hosted on a slot.

The frontend communicates with the backend via HTTP, and the backend's endpoint is hardcoded in the frontend's configuration.

I want to scale out this application, but I'm unsure about the best approach to achieve this.

I've been reading about Azure App Service Plan scale-out options, but I'm not sure how it fits with my current setup. From what I understand:

  • I can create a scale-out rule that triggers when the App Service Plan's CPU reaches a certain threshold (e.g., XX%).
  • Once the threshold is met, Azure will launch another instance of the App Service Plan.

Now, let's say a user navigates to mycustomdomainname.com. What happens behind the scenes? My current configuration involves an App Service Plan with one frontend and one backend, and the frontend has a custom domain name associated with it. My questions are as follows:

  • How does the newly launched App Service Plan instance know which domain name to use?
  • Do I need to remove the frontend from the original App Service Plan to avoid duplicating the frontend, considering I don't need multiple frontend instances?
  • Can I specify which application should be scaled out within the App Service Plan? (As far as I understand, all applications within the App Service Plan are scaled out together.)
  • After the new instance is up and running, how can Azure direct a connected user to contact the correct backend API (A or B) from the frontend? Additionally, how can I configure my frontend to use a generic API URL that is load-balanced over the multiple backend instances?

I'm also open to other suggestions regarding the architecture. Should I consider creating a dedicated load balancer in Azure to handle the distribution of requests from frontend to backend? And if so, is an App Service Plan still the right approach?

Thank you in advance for your help!


Solution

  • How does the newly launched App Service Plan instance know which domain name to use?

    Auto-scale knows nothing about domain names. It monitors the metrics of your App Service Plan, which you can think of as a server cluster. Each VM in that cluster runs all applications that are in the same plan. When a scale rule's condition is met, the auto-scaler adds or removes VMs from the cluster accordingly. When a VM is added, all applications within the plan are set up to run on it. The custom domain is bound to the app, not to an individual VM, so it keeps working unchanged as VMs come and go.
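To make the "server cluster" picture concrete, here is a minimal toy model in Python. It is only a sketch of the behaviour described above, not Azure's implementation: the `Plan` class, the `apply_scale_rule` function, the 70% threshold, and the instance limit are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Plan:
    """Toy model of an App Service Plan: a pool of VMs that each run every app."""
    apps: list[str]
    instance_count: int = 1

    def placement(self) -> dict[str, list[str]]:
        # Every VM in the plan hosts every app in the plan.
        return {f"vm{i}": list(self.apps) for i in range(self.instance_count)}


def apply_scale_rule(plan: Plan, avg_cpu_percent: float,
                     scale_out_above: float = 70.0,  # assumed threshold
                     max_instances: int = 3) -> Plan:  # assumed upper limit
    """Add one VM when average CPU exceeds the threshold (scale-out only)."""
    if avg_cpu_percent > scale_out_above and plan.instance_count < max_instances:
        plan.instance_count += 1
    return plan


plan = Plan(apps=["frontend", "backend"])
apply_scale_rule(plan, avg_cpu_percent=85.0)
for vm, apps in plan.placement().items():
    print(vm, apps)
```

Note that the rule never looks at domain names or at individual apps: it only changes the number of VMs, and the frontend and backend both end up on every one of them.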

  • Do I need to remove the frontend from the original App Service Plan to avoid duplicating the frontend, considering I don't need multiple frontend instances?

    Having a separate plan could be a good idea if the scaling requirements are different. Auto-scale affects all applications within one plan in the same way.

  • Can I specify which application should be scaled out within the App Service Plan? (As far as I understand, all applications within the App Service Plan are scaled out together.)

    No. All applications run on each VM.

  • After the new instance is up and running, how can Azure direct a connected user to contact the correct backend API (A or B) from the frontend? Additionally, how can I configure my frontend to use a generic API URL that is load-balanced over the multiple backend instances?

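    This is already done for you. App Service includes a load balancer in front of all the VMs, and that load balancer is what you reach when you connect to the app's URL. The auto-scaler informs the load balancer of changes to the VMs, so when one is added, future requests may be routed there. The backend URL your frontend already uses is therefore the generic, load-balanced URL; there is no per-instance address to choose between.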
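As a rough illustration of why the frontend needs only one backend URL, the sketch below models a load balancer that sits behind a single hostname. Everything here is an assumption made for the example: the `PlanLoadBalancer` class, the hostname `my-backend.example`, and the round-robin policy (the actual distribution algorithm used by App Service is not described in this answer).

```python
from collections import Counter


class PlanLoadBalancer:
    """Toy load balancer: one public hostname, many interchangeable VMs."""

    def __init__(self, hostname: str, instances: list[str]):
        self.hostname = hostname
        self.instances = list(instances)
        self._next = 0  # position for the assumed round-robin policy

    def add_instance(self, name: str) -> None:
        # Stands in for the auto-scaler notifying the load balancer of a new VM.
        self.instances.append(name)

    def route(self, url: str) -> str:
        """Return the VM that would serve a request for the given URL."""
        if not url.startswith(f"https://{self.hostname}/"):
            raise ValueError(f"unknown host in {url!r}")
        instance = self.instances[self._next % len(self.instances)]
        self._next += 1
        return instance


# The frontend is configured with this single URL and never changes it.
BACKEND_URL = "https://my-backend.example/api/orders"  # hypothetical

lb = PlanLoadBalancer("my-backend.example", ["vm0"])
before = Counter(lb.route(BACKEND_URL) for _ in range(2))
lb.add_instance("vm1")  # scale-out happens
after = Counter(lb.route(BACKEND_URL) for _ in range(4))
print(before, after)
```

The caller's code is identical before and after the scale-out; only the set of VMs behind the hostname changes.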

    Small side note: if you have "ARR affinity" enabled in the App Service configuration, a cookie is sent in the first response that pins the client to a particular VM for its later requests. Enabling this is not recommended unless, for example, you are migrating a legacy application that depends on in-memory session data. Check that it is disabled to ensure requests are routed evenly between the instances.
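The effect of affinity on load distribution can be seen in a small simulation. This is a sketch under stated assumptions: the `simulate` function, the client names, and the round-robin choice for unpinned requests are hypothetical, and the cookie value used here (the VM name) is made up; only the cookie name `ARRAffinity` comes from App Service.

```python
import itertools
from collections import Counter

AFFINITY_COOKIE = "ARRAffinity"  # cookie name used by App Service; value format below is invented


def simulate(instances: list[str], requests_per_client: int,
             clients: list[str], affinity_enabled: bool) -> Counter:
    """Count how many requests each VM serves, with or without sticky routing."""
    round_robin = itertools.cycle(instances)  # assumed policy for unpinned requests
    hits: Counter = Counter()
    cookie_jars: dict[str, dict[str, str]] = {c: {} for c in clients}

    for _ in range(requests_per_client):
        for client in clients:
            jar = cookie_jars[client]
            pinned = jar.get(AFFINITY_COOKIE) if affinity_enabled else None
            # A valid affinity cookie wins; otherwise pick the next VM in turn.
            target = pinned if pinned in instances else next(round_robin)
            if affinity_enabled:
                jar[AFFINITY_COOKIE] = target  # set on the first response, reused afterwards
            hits[target] += 1
    return hits


vms = ["vm0", "vm1"]
users = ["alice", "bob", "carol"]
print("affinity on: ", simulate(vms, 4, users, affinity_enabled=True))
print("affinity off:", simulate(vms, 4, users, affinity_enabled=False))
```

With three clients and two VMs, pinning sends two clients' entire traffic to one VM and only one client's to the other, while the unpinned run splits the same twelve requests evenly.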