OpenShift 4.x - Requests exceeding the concurrency property value

We use OpenShift 4.x. For an API, the minimum number of pods is 5 and the maximum is 8. Horizontal autoscaling is configured based on the average CPU utilization percentage. The property haproxy.router.openshift.io/pod-concurrent-connections = '10' restricts the number of connections to each pod to 10. What happens if a pod gets more requests than that? Do they wait in a queue, or do the pods scale up horizontally?

Below is the current configuration in the Route for this API:

haproxy.router.openshift.io/disable_cookies: 'true'
haproxy.router.openshift.io/balance: roundrobin
haproxy.router.openshift.io/pod-concurrent-connections: '10'
haproxy.router.openshift.io/timeout: 50s

Solution

The HPA spawns a new pod when the CPU utilization threshold defined in the HPA is reached, not based on the number of connections.
So to answer your question "What happens if we get more requests to the pod?", the answer is:

yes, the pods scale up (to at most 8), if at that time the 10 connections per pod make the pods consume more CPU than the threshold defined in the HPA;
otherwise no, and the extra requests are queued in HAProxy.

The HPA may also spawn another pod before the connection limit is reached, if CPU usage goes above the defined threshold. The two metrics (number of connections and CPU consumption) are not related in the context of k8s/OCP.
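As a rough illustration of the queuing side, here is a minimal Python sketch. It assumes a single router instance and an even round-robin spread across pods. The function name split_load is hypothetical, and the code only models the arithmetic, not HAProxy itself.

```python
from typing import Tuple


def split_load(concurrent_requests: int, pods: int, per_pod_limit: int = 10) -> Tuple[int, int]:
    """Return (in_flight, queued) for a number of simultaneous requests.

    per_pod_limit mirrors the pod-concurrent-connections annotation.
    Assumption: one router instance, so total capacity is pods * per_pod_limit.
    """
    capacity = pods * per_pod_limit
    in_flight = min(concurrent_requests, capacity)
    queued = max(0, concurrent_requests - capacity)
    return in_flight, queued


print(split_load(40, pods=5))  # (40, 0): below capacity, nothing waits
print(split_load(65, pods=5))  # (50, 15): 15 requests wait in the router
print(split_load(65, pods=8))  # (65, 0): more pods, if the HPA added them for CPU reasons
```

Note that nothing in this calculation feeds back into the pod count: the queue length is invisible to a CPU-based HPA.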

Keep in mind that the CPU value the HPA compares against its threshold is the average CPU utilization observed across all running pods at that time.
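The scaling side can be sketched with the replica formula from the Kubernetes HPA documentation, desired = ceil(current * currentMetric / targetMetric). In the sketch below, the 70% target is an assumed value (the question does not state it), the 10% tolerance is assumed to be the cluster default, and desired_replicas is a hypothetical helper, not an OpenShift API.

```python
import math


def desired_replicas(
    current_replicas: int,
    avg_cpu_pct: float,          # average utilization across all running pods
    target_cpu_pct: float = 70,  # assumed HPA target; not given in the question
    min_replicas: int = 5,
    max_replicas: int = 8,
    tolerance: float = 0.1,      # assumed default tolerance
) -> int:
    """Approximate the replica count a CPU-based HPA would ask for."""
    ratio = avg_cpu_pct / target_cpu_pct
    # Within the tolerance band the HPA leaves the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    wanted = math.ceil(current_replicas * ratio)
    # Clamp to the configured min/max pod counts.
    return max(min_replicas, min(max_replicas, wanted))


print(desired_replicas(5, avg_cpu_pct=90))   # 7
print(desired_replicas(5, avg_cpu_pct=200))  # 8, capped at the maximum
print(desired_replicas(5, avg_cpu_pct=40))   # 5, held at the minimum
```

The connection count never appears as an input here, which is why a full per-pod connection limit alone does not trigger a scale-up.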