Minimum system requirements for a Rasa server and limit on the number of parallel client requests?

I have started exploring Rasa and am planning to switch from Dialogflow to Rasa. However, my attempts so far to answer the following two questions from the Rasa docs and from previous forum posts, such as "RASA Chatbot | System Requirement" and "Minimum/Recommended System Requirements for RASA(NLU+Core)", have been in vain: the links provided in the answers are broken and apparently no longer valid.

So here are my two questions:

What are the minimum and recommended system requirements to host a Rasa server?
What is the maximum number of client requests a Rasa server can process in parallel?

Thanks in advance.

Solution

What are the minimum and recommended system requirements to host a Rasa server?

That depends heavily on your model. If you are using pretrained embeddings (e.g. spaCy embeddings), then the model itself is already a couple of gigabytes in size. Furthermore, the number of policies and the NLU components you use heavily affect performance (e.g. 1 policy is obviously faster than 5 policies). So it is best to set up a load test with your own configuration and model.
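A load test like the one suggested above could be sketched roughly as follows. This is only a minimal illustration using the Python standard library, and it assumes that the REST channel is enabled and that the server listens on the default port 5005. The names RASA_URL, send_message and run_load_test are made up for this example and are not part of Rasa.

```python
import json
import statistics
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Assumption: REST channel enabled, server reachable on the default port.
RASA_URL = "http://localhost:5005/webhooks/rest/webhook"


def send_message(sender_id, text, url=RASA_URL):
    """POST one user message to the REST channel; return the latency in seconds."""
    payload = json.dumps({"sender": sender_id, "message": text}).encode("utf-8")
    request = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    start = time.perf_counter()
    with urllib.request.urlopen(request, timeout=30) as response:
        response.read()  # wait for the full bot response
    return time.perf_counter() - start


def run_load_test(num_requests, concurrency, send=send_message):
    """Send num_requests messages from `concurrency` threads and summarize latencies."""

    def one_call(i):
        # Each request uses its own sender id, i.e. its own conversation.
        return send(f"load-test-user-{i}", "hello")

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = sorted(pool.map(one_call, range(num_requests)))

    return {
        "requests": len(latencies),
        "mean_s": statistics.mean(latencies),
        "p95_s": latencies[int(0.95 * (len(latencies) - 1))],
        "max_s": latencies[-1],
    }
```

With the server running, something like run_load_test(num_requests=100, concurrency=10) gives a first impression of the latencies for a given model and configuration. Watching the memory and CPU usage of the Rasa process during the run then hints at the hardware you actually need. A dedicated load-testing tool would be the more thorough choice for real capacity planning.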

What is the maximum number of client requests a Rasa server can process in parallel?

Rasa (1.x) uses a Sanic web server. Rasa uses 1 Sanic worker, which means it runs in one process. So technically only one request is processed at a time. However, Sanic runs asynchronously, which means that it can process other requests while the current request is blocked (e.g. because it is waiting for a response from your custom action server).
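To illustrate the asynchronous behaviour, here is a small sketch using plain asyncio rather than Sanic or Rasa itself. The handle_request coroutine is a hypothetical stand-in for a request handler, and asyncio.sleep stands in for waiting on a custom action server. Both "requests" run in a single process on a single event loop, yet the second one finishes while the first is still waiting.

```python
import asyncio


async def handle_request(name, wait_s, log):
    """Simulate a request that waits on a slow external call."""
    log.append(f"{name} start")
    # While this request is waiting, the event loop is free to run others.
    await asyncio.sleep(wait_s)
    log.append(f"{name} done")


async def serve_concurrently():
    log = []
    # Two requests handled by the same process and the same event loop.
    await asyncio.gather(
        handle_request("A", 0.2, log),
        handle_request("B", 0.1, log),
    )
    return log


log = asyncio.run(serve_concurrently())
print(log)  # ['A start', 'B start', 'B done', 'A done']
```

Note that this only helps for I/O waits. CPU-bound work such as NLU inference or policy prediction still occupies the single process, so requests queue up behind it.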