Tags: python, websocket, twilio, load-balancing, ivr

Using a Load Balancer with Twilio for IVR App


I'm architecting an IVR app built with a Python framework (e.g. Flask). The app receives an audio stream from Twilio over WebSockets when a user calls a designated number configured in the Twilio dashboard. Concurrently, Twilio also makes POST requests to a webhook I provide. How do I add a load balancer so that, if a call in progress overwhelms the resources of one containerized instance of the app, the load balancer transfers that call, with all its state, to another containerized instance without the user noticing anything (seamless to the user)? Has anyone done this before? If so, could you please share your experience. Much appreciated. Thanks.


Solution

  • I don't think you can easily "move" an in-progress call off an overloaded container the way you describe. But here are some ideas for handling or preventing an overload.

    Option 1 - Load balance proactively. Distribute incoming requests across your containers (e.g. round-robin) before the call starts, so each container receives only a share of the traffic and you always have more capacity than you need. Add more containers based on your monitoring of used capacity.
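    A minimal sketch of this proactive rotation: when Twilio hits your webhook for a new call, answer with TwiML that connects the media stream to the next container in a round-robin pool. The hostnames below are placeholders, and the TwiML is built as a plain string rather than with the Twilio helper library.

    ```python
    from itertools import cycle

    # Hypothetical pool of container WebSocket endpoints (placeholder hosts).
    STREAM_HOSTS = cycle([
        "wss://ivr-1.example.com/media",
        "wss://ivr-2.example.com/media",
    ])

    def twiml_for_incoming_call() -> str:
        """Build TwiML that connects the caller's media stream to the
        next container in the rotation, chosen before the call starts."""
        host = next(STREAM_HOSTS)
        return (
            '<?xml version="1.0" encoding="UTF-8"?>'
            "<Response>"
            "<Connect>"
            f'<Stream url="{host}"/>'
            "</Connect>"
            "</Response>"
        )
    ```

    Each new call alternates between the two hosts, so no single container takes all the traffic.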

    Option 2 - Use a backend service that scales automatically. For example, I use Azure Functions which will scale automatically (pretty quickly too).

    Option 3 - Wait for an overload to occur, sacrifice that call, and then move subsequent calls to an unused container using the Twilio fallback url.

    Twilio's voice URLs have a primary and a fallback URL. The fallback URL is used when the primary URL fails, whether due to overload or unavailability. Your application needs to return an HTTP response that indicates failure. Twilio's documentation on Availability and Reliability may give you some ideas.
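    The failure signal itself can be as simple as returning a non-2xx status from the primary webhook when the instance is overloaded. A sketch, where the threshold and the `active_calls` metric are assumptions (in practice you might sample CPU, call count, or queue depth):

    ```python
    # Hypothetical overload threshold for one container instance.
    MAX_ACTIVE_CALLS = 50

    def webhook_status(active_calls: int) -> int:
        """Return the HTTP status for Twilio's request to the primary URL.
        An error response (or a timeout) is what causes Twilio to retry
        the fallback URL configured for the number."""
        if active_calls >= MAX_ACTIVE_CALLS:
            return 503  # signal failure so Twilio falls over to the fallback URL
        return 200
    ```

    In a Flask app you would return this status from the webhook route; the fallback URL would point at a deployment with spare capacity.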

    You could also detect an overload, stop the Stream, and then restart it toward a healthy container. Figuring out how to recover from a failure while you're experiencing it sounds tricky to implement and test.
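    As a sketch of that stop-and-restart idea: TwiML's `<Stop><Stream>` can end a stream by name, which assumes the original stream was opened with `<Start><Stream name="...">` (named streams are what `<Stop>` can target). The function below just builds that TwiML as a string; the stream name and host are placeholders.

    ```python
    def failover_twiml(stream_name: str, healthy_host: str) -> str:
        """Build TwiML that stops a named media stream and starts a new
        one toward a healthy container. Assumes the original stream was
        started with <Start><Stream name="...">."""
        return (
            '<?xml version="1.0" encoding="UTF-8"?>'
            "<Response>"
            f'<Stop><Stream name="{stream_name}"/></Stop>'
            f'<Start><Stream url="{healthy_host}"/></Start>'
            "</Response>"
        )
    ```

    You would push this at the live call with the REST API's call update (e.g. `client.calls(call_sid).update(twiml=...)` in the Python helper library), but whether the handoff is seamless to the caller depends on how fast the new container picks up the stream.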