Tags: c++, boost-asio, boost-beast

Concurrent request processing with Boost Beast


I'm referring to this sample program from the Beast repository: https://www.boost.org/doc/libs/1_67_0/libs/beast/example/http/server/fast/http_server_fast.cpp

I've made some changes to the code to check the ability to process multiple requests simultaneously.

 boost::asio::io_context ioc{1};
 tcp::acceptor acceptor{ioc, {address, port}};

 std::list<http_worker> workers;
 for (int i = 0; i < 10; ++i)
 {
     workers.emplace_back(acceptor, doc_root);
     workers.back().start();
 }

 ioc.run();

My understanding of the above is that I will now have 10 worker objects to run I/O, i.e. handle incoming connections.

So, my first question: is the above understanding correct?

Assuming that the above is correct, I've made some changes to the lambda (handler) passed to the tcp::acceptor:

    void accept()
    {
        // Clean up any previous connection.
        boost::beast::error_code ec;
        socket_.close(ec);
        buffer_.consume(buffer_.size());

        acceptor_.async_accept(
            socket_,
            [this](boost::beast::error_code ec)
            {
                if (ec)
                {
                    accept();
                }
                else
                {
                    boost::system::error_code ec2;
                    boost::asio::ip::tcp::endpoint endpoint = socket_.remote_endpoint(ec2);

                    // Request must be fully processed within 60 seconds.
                    request_deadline_.expires_after(
                        std::chrono::seconds(60));

                    std::cerr << "Remote Endpoint address: " <<  endpoint.address() << " port: " << endpoint.port() << "\n";

                    read_request();
                }
            });
    }

And also in process_request():

    void process_request(http::request<request_body_t, http::basic_fields<alloc_t>> const& req)
    {
        switch (req.method())
        {
        case http::verb::get:
            std::cerr << "Simulate processing\n";
            std::this_thread::sleep_for(std::chrono::seconds(30));
            send_file(req.target());
            break;

        default:
            // We return responses indicating an error if
            // we do not recognize the request method.
            send_bad_response(
                http::status::bad_request,
                "Invalid request-method '" + req.method_string().to_string() + "'\r\n");
            break;
        }
    }

And here's my problem: if I send two simultaneous GET requests to my server, they are processed sequentially. I know this because the second "Simulate processing" line is printed ~30 seconds after the first, which means execution is blocked on the first request.

I've tried to read the documentation of boost::asio to better understand this, but to no avail.

The documentation for acceptor::async_accept says:

Regardless of whether the asynchronous operation completes immediately or not, the handler will not be invoked from within this function. Invocation of the handler will be performed in a manner equivalent to using boost::asio::io_service::post().

And the documentation for boost::asio::io_service::post() says:

The io_service guarantees that the handler will only be called in a thread in which the run(), run_one(), poll() or poll_one() member functions is currently being invoked.

So, if 10 workers are in the run() state, then why would the two requests get queued?

And also, is there a way to workaround this behavior without adapting to a different example? (e.g. https://www.boost.org/doc/libs/1_67_0/libs/beast/example/http/server/async/http_server_async.cpp)


Solution

  • io_context does not create threads internally to execute tasks; it uses the threads that explicitly call io_context::run. In the example, io_context::run is called from just one thread (the main thread), so you have only one thread executing tasks. That thread gets blocked in the sleep, and there is no other thread left to execute the remaining tasks.

    To make this example work you have to:

    1. Add more threads to the pool (as in the second example you referred to):
    size_t const threads_count = 4;
    std::vector<std::thread> v;
    v.reserve(threads_count - 1);
    for(size_t i = 0; i < threads_count - 1; ++i) { // add threads_count - 1 threads into the pool
        v.emplace_back([&ioc]{ ioc.run(); });
    }
    ioc.run(); // add the main thread into the pool as well
    
    2. Add synchronization (for example, using a strand, as in the second example) where it is needed (at least for socket reads and writes), because your application is now multi-threaded.

    UPDATE 1

    Answering the question "What is the purpose of the list of workers in the Beast example (the first one you referred to) if in fact io_context is only running on one thread?"

    Notice that, regardless of the thread count, the I/O operations here are asynchronous, meaning http::async_write(socket_, ...) does not block the thread. (Note that I am explaining the original example, not your modified version.) One worker handles one 'request-response' round trip.

    Imagine this situation: there are two clients, client1 and client2. Client1 has a poor internet connection (or requests a very big file) and client2 has the opposite conditions. Client1 makes a request, then client2 makes a request. If there were just one worker, client2 would have to wait until client1 finished the whole 'request-response' round trip. But because there is more than one worker, client2 gets its response immediately, without waiting for client1 (keep in mind that I/O does not block your single thread).

    The example is optimized for the situation where the bottleneck is I/O rather than the actual work. In your modified example you have the opposite situation: the work (30 s) is very expensive compared to the I/O. For that case the second example is the better fit.