
Understanding how k6 manages, at a low level, a large number of API calls in a short period of time


I'm new to k6 and I'm sorry if I'm asking something naive. I'm trying to understand how the tool manages network calls under the hood. Does it execute them at the maximum rate it can? Does it queue them based on the System Under Test's response time?

I need to understand this because I'm running a lot of tests using both k6 run and k6 cloud, but I can't make more than ~2,000 requests per second (looking at the k6 results). I was wondering whether k6 implements some kind of back-pressure mechanism when it detects that my system is "slow", or whether there are other reasons why I can't overcome that limit.

I read here that it is possible to make 300,000 requests per second and that the cloud environment is already configured for that. I also tried to manually configure my machine, but nothing changed.

For example, the following tests are identical; the only change is the number of VUs. I ran all tests on k6 Cloud.

Shared parameters:

60 API calls per iteration (a single http.batch with 60 calls)
Iterations: 100
Executor: per-vu-iterations
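
A minimal sketch of what a test script with these parameters might look like (the scenario name and endpoint URLs are placeholders, not from the original post):

```javascript
import http from 'k6/http';

export const options = {
  scenarios: {
    batch_test: {                     // hypothetical scenario name
      executor: 'per-vu-iterations',  // each VU runs a fixed number of iterations
      vus: 10,                        // varied between 10 and 200 in the runs below
      iterations: 100,
    },
  },
};

export default function () {
  // Build 60 GET requests and fire them with a single http.batch call.
  // https://test.example/... is a placeholder endpoint.
  const requests = [];
  for (let i = 0; i < 60; i++) {
    requests.push(['GET', `https://test.example/api/items/${i}`]);
  }
  http.batch(requests); // blocks until all 60 responses have arrived
}
```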

Here I got 547 reqs/s:

VUs: 10 (60,000 calls with an avg response time of 108 ms)

Here I got 1,051.67 reqs/s:

VUs: 20 (120,000 calls with an avg response time of 112 ms)

Here I got 1,794.33 reqs/s:

VUs: 40 (240,000 calls with an avg response time of 134 ms)

Here I got 2,060.33 reqs/s:

VUs: 80 (480,000 calls with an avg response time of 238 ms)

Here I got 2,223.33 reqs/s:

VUs: 160 (960,000 calls with an avg response time of 479 ms)

Here I got a peak of 2,102.83 reqs/s:

VUs: 200 (1,081,380 calls with an avg response time of 637 ms) // I reached the max duration here, which is why the test stopped

What I was expecting is that if my system can't handle this many requests, I would see a lot of timeout errors, but I haven't seen any. What I'm seeing is that all the API calls are executed and no errors are returned. Can anyone help me?


Solution

  • Because k6, or more specifically your VUs, execute code synchronously, the throughput you can achieve is fully dependent on how quickly the system you're interacting with responds.

    Let's take this script as an example:

    import http from 'k6/http';
    
    export default function() {
      http.get("https://httpbin.org/delay/1");
    }
    

    The endpoint here is purposefully designed to take 1 second to respond. There is no other code in the exported default function. Because each VU will wait for a response (or a timeout) before proceeding past the http.get statement, the maximum amount of throughput for each VU will be a very predictable 1 HTTP request/sec.
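
    This ceiling can be written as a simple formula (a sketch in plain JavaScript; the function name is mine, not a k6 API):

    ```javascript
    // Upper bound on throughput for synchronous VUs:
    // each VU completes at most 1/responseTime requests per second,
    // so vus VUs are capped at vus / responseTime
    // (ignoring connection overhead and timeouts).
    function maxThroughput(vus, responseTimeSeconds) {
      return vus / responseTimeSeconds;
    }

    // With the 1-second httpbin endpoint above, 10 VUs top out at 10 reqs/s:
    console.log(maxThroughput(10, 1)); // 10
    ```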

    Often, response times (and/or errors, like timeouts) will increase as you increase the number of VUs. You will eventually reach a point where adding VUs does not result in higher throughput. In this situation, you've basically established the maximum throughput the System-Under-Test can handle. It simply can't keep up.

    The only situation where that might not be the case is when the system running k6 runs out of hardware resources (usually CPU time). This is something that you must always pay attention to.

    If you are using k6 OSS, you can scale to as many VUs (concurrent threads) as your system can handle. You could also use http.batch to fire off multiple requests concurrently within each VU (the statement will still block until all responses have been received). This might be slightly less overhead than spinning up additional VUs.