Apache: how to set an upper bound in Apache/mod_proxy on the number of concurrent forwarded requests?

I have a fairly standard setup where a front-end Apache server forwards requests to Tomcat through mod_proxy/AJP. How can I set up Apache/mod_proxy so that it forwards at most N (say, N=4) concurrent requests to Tomcat? Other concurrent requests coming into Apache should not be rejected; they should instead be queued and sent to Tomcat later.

PS 1: Note that this is something you can do at the Tomcat level with the maxThreads attribute, but I prefer to handle it at the Apache level.

PS 2: I see that Apache has a MaxClients configuration, which seems to do what I am looking for. But it is not clear to me how to have a MaxClients-style limit per server that mod_proxy forwards to, rather than one limit for the whole Apache. I.e. if Apache forwards requests to a cluster of 4 Tomcat machines, I'd like Apache to limit the number of concurrent requests forwarded to any given Tomcat to N (say, N=4).

Solution

The solution is to add parameters to mod_proxy's ProxyPass directive. The one you want is probably max. On its own, however, it returns an error immediately once the limit is reached, rather than queuing your requests.
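As a minimal sketch of that simplest form, assuming a single Tomcat reachable over AJP at 192.168.0.100:8009 (an illustrative address; substitute your own):

```apache
# Forward everything to one Tomcat over AJP.
# max=4 caps the number of connections each Apache child process
# will open to this backend.
ProxyPass / ajp://192.168.0.100:8009/ max=4
```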

If you really want requests to be queued, you also have to use mod_proxy_balancer. For example, to allow a maximum of 4 connections per back-end server:

ProxyPass / balancer://appservers/ stickysession=JSESSIONID|jsessionid nofailover=On
<Proxy balancer://appservers>
    BalancerMember ajp://192.168.0.100:8009 max=4
    BalancerMember ajp://192.168.0.101:8009 max=4
    BalancerMember ajp://192.168.0.102:8009 max=4
    BalancerMember ajp://192.168.0.103:8009 max=4
</Proxy>

Unfortunately, in Apache the value of max is per process. So you can only effectively limit the number of connections to your back-end servers if Apache runs a single process and uses threads, rather than processes, to handle multiple connections. Whether it does depends on which MPM Apache is using:

On Windows, you most likely don't have to worry about this, as the winnt MPM uses a single process, which in turn creates threads to handle requests.
On UNIX, if you're using the Apache that came with your OS, there is unfortunately a good chance you have the prefork MPM, which handles each connection in its own single-threaded process, so the max parameter cannot act as an overall limit:
1. To check what MPM you have, run apachectl -l.
2. If you see worker.c or event.c in the list, you are almost there: you just need to make sure that Apache creates only one process. To do this, set ThreadsPerChild and MaxClients to the same value, which will be the total number of concurrent connections your Apache can process, and set ServerLimit to 1.
3. If you see prefork.c in the list, you first need to replace your Apache with one built with the worker or event MPM. You can do so either by recompiling Apache yourself (in Apache 2.2 the MPM is fixed at compile time and is not a run-time configuration parameter) or by getting an existing package for your platform. Then go to step 2.
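The settings from step 2 might look like the following sketch for the worker MPM, assuming an Apache 2.2-style configuration. The value 25 is only illustrative, and StartServers is an addition not mentioned above, included so that Apache does not try to start more processes than ServerLimit allows.

```apache
<IfModule mpm_worker_module>
    # Allow only one child process, so the per-process "max" on each
    # BalancerMember behaves as a limit for the whole server.
    ServerLimit       1
    # Assumption: start exactly one process to match ServerLimit.
    StartServers      1
    # Same value for both: the total number of concurrent connections
    # this Apache will handle, all as threads of the single process.
    ThreadsPerChild  25
    MaxClients       25
</IfModule>
```

If you need more than the default thread ceiling, ThreadLimit would also have to be raised ahead of ThreadsPerChild; check the MPM documentation for your version before relying on this.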