Given:
- 1 machine (a laptop) that the load tests run from, with an Internet speed of 10 MB/s
- there is a service that acts as a proxy: it forwards traffic to AWS S3. That's it.
- the service runs in an AWS EKS cluster
- there is 1 pod of the service
- there are 10 EC2 instances (m5.8xlarge) as compute resources in the EKS cluster
- m5.8xlarge supports 10 Gigabit networking, which is roughly 1.25 GB/s (call it ~1 GB/s)
- Gatling is used for load testing
Test scenario:
- 1000 users in parallel upload a 1 MB file each to AWS S3 via the service
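For reference, a minimal sketch of how this scenario might look in Gatling (the base URL, endpoint, and payload file name are placeholders, not the real service's API):

```scala
import io.gatling.core.Predef._
import io.gatling.http.Predef._

class UploadSimulation extends Simulation {

  // Hypothetical base URL of the proxy service
  val httpProtocol = http.baseUrl("https://upload-proxy.example.com")

  val upload = scenario("1MB upload")
    .exec(
      http("upload 1MB file")
        .put("/files")                        // hypothetical upload endpoint
        .body(RawFileBody("payload-1mb.bin")) // a 1 MB file on the test machine
    )

  // 1000 users started at once, each uploading a single file
  setUp(upload.inject(atOnceUsers(1000))).protocols(httpProtocol)
}
```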
EC2 monitoring shows ~400 MB network-in during the load test.

The container metric container_network_receive_bytes_total (as a per-second rate) shows roughly 10 MB/s of incoming traffic.

How can I assess the maximum number of users, each sending a 1 MB file, that the infrastructure above can support?
My thoughts (a rough estimation) are as follows:
- 1000 users, each sending a 1 MB file, create container incoming traffic of 10 MB/s
- 100000 users might create container incoming traffic of 1000 MB/s (1 GB/s)
- the maximum network bandwidth of the EC2 instance is ~1 GB/s
Conclusion: the infrastructure can support at most 100000 users sending a 1 MB file each.
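Spelled out as a quick calculation (note that a 10 Gbit NIC is actually 1.25 GB/s, so ~1 GB/s is the conservative rounding):

```scala
// Back-of-envelope check of the extrapolation above. The weak point is the
// linearity assumption: per-user traffic is taken to stay constant as the
// number of users grows.
object CapacityEstimate extends App {
  val observedUsers = 1000
  val observedMBps  = 10.0                          // container network-in during the test
  val perUserMBps   = observedMBps / observedUsers  // 0.01 MB/s per user
  val nicLimitMBps  = 1250.0                        // 10 Gbit/s = 1.25 GB/s
  println(f"Max users ≈ ${nicLimitMBps / perUserMBps}%.0f") // ≈ 125000, or ~100000 with the 1 GB/s rounding
}
```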
Am I right at least roughly?
Of course there could be many "buts".
E.g. I'm using 1 test machine with limited Internet speed (10 MB/s).
So those 1000 users that are up and sending a 1 MB file from the test machine are limited by the laptop's Internet speed. Most likely, if the speed were higher, or if each of the 1000 users ran on a separate machine, the container bandwidth and EC2 network-in would be much higher.
Run a stress test.
Start with 1 virtual user and gradually increase the load while monitoring metrics like response time, transactions per second, and error count.
In an ideal system the number of transactions per second increases proportionally to the load. At some point, however, you will observe that as you increase the load the number of transactions per second stays flat or even drops, response time goes up, errors start occurring, or the system under test crashes.
The number of virtual users that were "active" at that point is the answer you're looking for.
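In Gatling this kind of stepped ramp-up can be expressed with a closed-model "stairs" injection profile, something like the sketch below (the base URL, endpoint, step size and durations are placeholders to tune):

```scala
import io.gatling.core.Predef._
import io.gatling.http.Predef._
import scala.concurrent.duration._

class UploadStressSimulation extends Simulation {

  val httpProtocol = http.baseUrl("https://upload-proxy.example.com") // hypothetical

  val upload = scenario("1MB upload")
    .exec(
      http("upload 1MB file")
        .put("/files")                        // hypothetical upload endpoint
        .body(RawFileBody("payload-1mb.bin"))
    )

  setUp(
    upload.inject(
      // Closed-model "stairs": start at 1 concurrent user and add 50 more
      // every 2 minutes, up to 1000 concurrent users.
      incrementConcurrentUsers(50)
        .times(20)
        .eachLevelLasting(2.minutes)
        .separatedByRampsLasting(30.seconds)
        .startingFrom(1)
    )
  ).protocols(httpProtocol)
}
```

Watch for the step at which transactions per second stop growing; the concurrent-user level of that step is your capacity estimate.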
A few things to consider:
- Make sure each virtual user behaves like a real user. I doubt a real user uploads files non-stop from 9 to 5 without bio breaks, so set up your script to carefully mimic real user behaviour: how often a file is uploaded, whether the connection is kept open between uploads, any related actions like authentication/authorization, etc. (see the sketch after this list)
- You cannot extrapolate the results of a "smaller" test onto a "bigger" system. If you're limited by your load generator's bandwidth, find another load generator and run Gatling on 2 machines at the same time; the approach is described in the Gatling - Scaling Out section.
- You might also want to test your system's ability to scale out when the load increases (and scale down when it decreases) and measure the scalability factor, i.e. if 1 instance can handle 1000 users, how many users will 2 instances handle?
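As a sketch of the first point, a more realistic virtual user could look something like this (the login step, endpoints, iteration count and pause ranges are invented placeholders; model them on your real traffic):

```scala
import io.gatling.core.Predef._
import io.gatling.http.Predef._
import scala.concurrent.duration._

class RealisticUploadSimulation extends Simulation {

  val httpProtocol = http.baseUrl("https://upload-proxy.example.com") // hypothetical

  val realisticUser = scenario("Realistic uploader")
    .exec(
      http("login")                           // hypothetical auth step
        .post("/auth/login")
        .body(StringBody("""{"user":"test","password":"test"}""")).asJson
    )
    .pause(2.seconds, 10.seconds)             // think time before the first upload
    .repeat(5) {                              // each user uploads 5 files per session
      exec(
        http("upload 1MB file")
          .put("/files")                      // hypothetical upload endpoint
          .body(RawFileBody("payload-1mb.bin"))
      ).pause(30.seconds, 2.minutes)          // a real user doesn't upload non-stop
    }

  // Arrivals spread out instead of all at once
  setUp(realisticUser.inject(rampUsers(1000).during(10.minutes)))
    .protocols(httpProtocol)
}
```

With randomized pauses like these, the sustained per-user throughput is far lower than with back-to-back uploads, which changes the capacity estimate accordingly.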