Getting Locust to send a predefined distribution of requests per second

I previously asked this question about using Locust as the means of delivering a static, repeatable request load to the target server (n requests per second for five minutes, where n is predetermined for each second), and it was determined that it's not readily achievable.

So, I took a step back and reformulated the problem into something that you probably could do using a custom load shape, but I'm not sure how – hence this question.

As in the previous question, we have a 5-minute period of extracted Apache logs, where each second, anywhere from 1 to 36 GET requests were made to an Apache server. From those logs, I can get a distribution of how many times a certain requests-per-second rate appeared; e.g. there's a 1/4000 chance of 36 requests being processed on any given second, 1/50 for 18 requests to be processed on any given second, etc.

I can model the distribution of request rates as a simple Python list: the numbers between 1 and 36 appear in it an equal number of times as 1–36 requests per second were made in the 5-minute period captured in the Apache logs, and then just randomly get a number from it in the tick() method of a custom load shape to get a number that informs the (user count, spawn rate) calculation.

Additionally, by using a predetermined random seed, I can make the test runs repeatable to within an acceptable level of variation to be useful in testing my API server configuration changes, since the same random list elements should be retrieved each time.

The problem is that I'm not yet able to "think in Locust", to think in terms of user counts and spawn rates instead of rates of requests received by the server.

The question becomes this:

How do you implement the tick() method of a custom load shape in such a way that the (user count, spawn rate) tuple results in a roughly known distribution of requests per second to be sent, possibly with the help of other configuration options and plugins?

Solution

You need to create a Locust User with the tasks you want it to run (e.g. make your http calls). You can define time between tasks to kind of control the requests per second. If you have a task to make a single http call and define wait_time = constant(1) you can roughly get 1 request per second. Locust's spawn_rate is a per second unit. Since you have the data you want to reproduce already and it's in 1 second intervals, you can then create a LoadTestShape class with the tick() method somewhat like this:

class MyShape(LoadTestShape):
    repro_data = […]
    last_user_count = 0
    def tick(self):
        self.last_user_count = requests_per_second
        if len(self.repro_data) > 0:
            requests_per_second = self.repro_data.pop(0)
            requests_per_second_diff = abs(last_user_count - requests_per_second)
            return (requests_per_second, requests_per_second_diff)
        return None

If your first data point is 10 requests, you'd need requests_per_second=10 and requests_per_second_diff=10 to make Locust spin up all 10 users in a single second. If the next second is 25, you'd have requests_per_second=25 and requests_per_second_diff=15. In a Load Shape, spawn_rate also works for decreasing the number of users. So if next is 16, requests_per_second=16 and requests_per_second_diff=9.