Tags: java, asynchronous, jetty, netty, javalin

Jetty Async not working as anticipated, experiencing worse performance than sync


Thanks in advance for any pointers or help.

Basically I was expecting the async version of this to perform far better than the sync version, but instead the sync version performs equivalently or even slightly better.

Am I doing something wrong? What gives? I tried without Javalin in case something in the framework was causing the problem, but that gave similar results. I also tried this with just Netty (too long to post the code) and saw similar results there as well.

I wrote the following code: (javalin-3.12.0 and jetty-9.4.31.v20200723)

import io.javalin.Javalin;
import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.util.thread.QueuedThreadPool;
import java.io.IOException;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class AsyncTest {
    static ScheduledThreadPoolExecutor scheduledThreadPoolExecutor = new ScheduledThreadPoolExecutor(5000);
    public static void main(String[] args) {
        var jav = Javalin.create();
        jav.config.server(() -> new Server(new QueuedThreadPool(5000, 500, 120_000)));
        Javalin app = jav.start(8080);

        // Async version: release the servlet thread immediately and
        // complete the response from a scheduler thread 100 ms later.
        app.get("/async-delay", ctx -> {
            var async = ctx.req.startAsync();
            scheduledThreadPoolExecutor.schedule(() -> {
                try {
                    ctx.res.getOutputStream().println("ok");
                } catch (IOException e) {
                    e.printStackTrace();
                }
                async.complete();
            }, 100, TimeUnit.MILLISECONDS);
        });

        // Sync version: hold the servlet thread for the full 100 ms.
        app.get("/delay", ctx -> {
            Thread.sleep(100);
            ctx.result("ok");
        });

        // Baseline: no artificial delay at all.
        app.get("/no-delay", ctx -> ctx.result("ok"));
    }
}

And got the following results:

➜  ~ wrk2 -t16 -c300 -d5s -R3000 http://localhost:8080/delay
Running 5s test @ http://localhost:8080/delay
  16 threads and 300 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   331.36ms  138.72ms 626.18ms   57.34%
    Req/Sec        nan       nan   0.00      0.00%
  10854 requests in 5.00s, 1.24MB read
  Socket errors: connect 53, read 0, write 0, timeout 106
Requests/sec:   2170.40
Transfer/sec:    254.34KB
➜  ~ wrk2 -t16 -c300 -d5s -R3000 http://localhost:8080/async-delay
Running 5s test @ http://localhost:8080/async-delay
  16 threads and 300 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   285.84ms  120.75ms 522.50ms   56.27%
    Req/Sec        nan       nan   0.00      0.00%
  11060 requests in 6.10s, 1.29MB read
  Socket errors: connect 53, read 0, write 0, timeout 124
Requests/sec:   1814.16
Transfer/sec:    216.14KB
➜  ~ wrk2 -t16 -c16 -d5s -R70000 http://localhost:8080/no-delay
Running 5s test @ http://localhost:8080/no-delay
  16 threads and 16 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     2.51ms    3.12ms  21.95ms   88.36%
    Req/Sec        nan       nan   0.00      0.00%
  349824 requests in 5.00s, 40.03MB read
Requests/sec:  69995.44
Transfer/sec:      8.01MB

Solution

  • Since Jetty 9+ is 100% asynchronous from the get-go, this lack of difference makes sense. (In fact, Jetty 9+ does extra work to pretend to be synchronous when you use synchronous APIs like InputStream.read() or OutputStream.write().)
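To see why a blocking API can sit on top of an async core, here is a minimal, self-contained sketch using only the JDK. This is a hypothetical illustration, not Jetty's actual internals: the "blocking" read simply parks the caller on a future that an async event completes later, which is essentially what both of your handlers reduce to either way.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch (not Jetty's code): a blocking read implemented
// on top of an async core, the way Jetty 9+ emulates InputStream.read()
// over its internal async I/O.
public class BlockingOverAsync {
    private final CompletableFuture<String> pending = new CompletableFuture<>();

    // Async side: data "arrives" on some I/O thread and completes the future.
    void onDataAvailable(String data) {
        pending.complete(data);
    }

    // "Blocking" API: just parks the caller until the async event fires.
    String blockingRead() throws Exception {
        return pending.get(1, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws Exception {
        var io = new BlockingOverAsync();
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        // Simulate data arriving 50 ms later on another thread.
        scheduler.schedule(() -> io.onDataAvailable("ok"), 50, TimeUnit.MILLISECONDS);
        System.out.println(io.blockingRead()); // parks ~50 ms, then prints "ok"
        scheduler.shutdown();
    }
}
```

Either way a thread waits out the delay, which is why the sync and async routes in the benchmark end up in the same ballpark.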

    Also, your load-testing workloads are not realistic.

    • You want many more client machines doing the testing. No single software client alone is capable of stressing a Jetty server; you'll hit system resource limits well before you hit any kind of Jetty serving limit.
      • You need at least a 4-to-1 ratio (we test with an 8-to-1 ratio) of client machines to server machines to generate enough load to stress Jetty.
    • You want many concurrent connections to the server (think 40,000+).
      • Or you want HTTP/2 in the picture (which also stresses the server's resources in its own unique ways).
    • You want lots of data returned (something that would take multiple network buffers to return).
    • You want to sprinkle in a few client connections that are slow to read as well (on a synchronous server these can impact the rest of the connections, the ones that are not slow, simply by consuming too many resources).
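The slow-reader point can be made concrete with a small, self-contained JDK sketch. This is a toy illustration, not Jetty and not part of the benchmark above; the payload size, buffer size, and per-read delay are arbitrary choices. With blocking writes, the serving thread is pinned for as long as the slowest client takes to drain the response:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Toy demo: a slow-reading client holds a blocking-write server thread
// hostage for the entire transfer.
public class SlowReaderDemo {
    public static void main(String[] args) throws Exception {
        byte[] payload = new byte[256 * 1024]; // enough to overflow socket buffers
        try (ServerSocket server = new ServerSocket(0)) {
            ExecutorService pool = Executors.newSingleThreadExecutor();
            pool.submit(() -> {
                // "Server": one blocking write of the whole payload.
                try (Socket s = server.accept();
                     OutputStream out = s.getOutputStream()) {
                    long start = System.nanoTime();
                    out.write(payload); // stalls while the client dawdles
                    out.flush();
                    long heldMs = (System.nanoTime() - start) / 1_000_000;
                    System.out.println("serving thread held for ~" + heldMs + " ms");
                } catch (IOException e) {
                    e.printStackTrace();
                }
                return null;
            });

            // "Client": connect with a small receive buffer and read slowly.
            try (Socket client = new Socket()) {
                client.setReceiveBufferSize(4 * 1024); // set before connect
                client.connect(new InetSocketAddress("localhost", server.getLocalPort()));
                InputStream in = client.getInputStream();
                byte[] buf = new byte[4 * 1024];
                int total = 0;
                int n;
                while (total < payload.length && (n = in.read(buf)) >= 0) {
                    total += n;
                    Thread.sleep(5); // deliberately slow consumer
                }
                System.out.println("client read " + total + " bytes");
            }
            pool.shutdown();
            pool.awaitTermination(10, TimeUnit.SECONDS);
        }
    }
}
```

Multiply one such pinned thread by a few hundred slow clients and a synchronous server's thread pool drains quickly, which is the failure mode async serving is actually designed to avoid; a 100 ms uniform sleep never exercises it.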