Tags: java, asynchronous, jetty, netty, javalin

Jetty Async not working as anticipated, experiencing worse performance than sync


Thanks in advance for any pointers or help.

Basically I was expecting the async version of this to perform far better than the sync version, but instead the sync version performs equivalently or even slightly better.

Am I doing something wrong? What gives? I tried without Javalin in case something in the framework was causing the problem, but that gave similar results. I also tried this with just Netty (too long to post the code) and saw similar results there as well.

I wrote the following code: (javalin-3.12.0 and jetty-9.4.31.v20200723)

import io.javalin.Javalin;
import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.util.thread.QueuedThreadPool;
import java.io.IOException;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class AsyncTest {
    static ScheduledThreadPoolExecutor scheduledThreadPoolExecutor = new ScheduledThreadPoolExecutor(5000);
    public static void main(String[] args) {
        var jav = Javalin.create();
        jav.config.server(() -> new Server(new QueuedThreadPool(5000, 500, 120_000)));
        Javalin app = jav.start(8080);

        // Async version: release the servlet thread immediately and
        // complete the response from a scheduler thread 100 ms later.
        app.get("/async-delay", ctx -> {
            var async = ctx.req.startAsync();
            scheduledThreadPoolExecutor.schedule(() -> {
                try {
                    ctx.res.getOutputStream().println("ok");
                } catch (IOException e) {
                    e.printStackTrace();
                }
                async.complete();
            }, 100, TimeUnit.MILLISECONDS);
        });

        // Sync version: hold the servlet thread for the full 100 ms.
        app.get("/delay", ctx -> {
            Thread.sleep(100);
            ctx.result("ok");
        });

        // Baseline: no artificial delay at all.
        app.get("/no-delay", ctx -> ctx.result("ok"));
    }
}

And got the following results:

➜  ~ wrk2 -t16 -c300 -d5s -R3000 http://localhost:8080/delay
Running 5s test @ http://localhost:8080/delay
  16 threads and 300 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   331.36ms  138.72ms 626.18ms   57.34%
    Req/Sec        nan       nan   0.00      0.00%
  10854 requests in 5.00s, 1.24MB read
  Socket errors: connect 53, read 0, write 0, timeout 106
Requests/sec:   2170.40
Transfer/sec:    254.34KB
➜  ~ wrk2 -t16 -c300 -d5s -R3000 http://localhost:8080/async-delay
Running 5s test @ http://localhost:8080/async-delay
  16 threads and 300 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   285.84ms  120.75ms 522.50ms   56.27%
    Req/Sec        nan       nan   0.00      0.00%
  11060 requests in 6.10s, 1.29MB read
  Socket errors: connect 53, read 0, write 0, timeout 124
Requests/sec:   1814.16
Transfer/sec:    216.14KB
➜  ~ wrk2 -t16 -c16 -d5s -R70000 http://localhost:8080/no-delay
Running 5s test @ http://localhost:8080/no-delay
  16 threads and 16 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     2.51ms    3.12ms  21.95ms   88.36%
    Req/Sec        nan       nan   0.00      0.00%
  349824 requests in 5.00s, 40.03MB read
Requests/sec:  69995.44
Transfer/sec:      8.01MB

Solution

  • Since Jetty 9+ is 100% asynchronous from the get-go, this lack of difference makes sense. (In fact, Jetty 9+ does extra work to pretend to be synchronous when you use synchronous APIs like InputStream.read() or OutputStream.write().)
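To see why a blocking API can sit on top of an async core, here is a minimal, self-contained sketch using only the JDK. This is a hypothetical illustration, not Jetty's actual internals: the "blocking" read simply parks the caller on a future that an async event completes later, which is essentially what both of your handlers reduce to either way.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch (not Jetty's code): a blocking read implemented
// on top of an async core, the way Jetty 9+ emulates InputStream.read()
// over its internal async I/O.
public class BlockingOverAsync {
    private final CompletableFuture<String> pending = new CompletableFuture<>();

    // Async side: data "arrives" on some I/O thread and completes the future.
    void onDataAvailable(String data) {
        pending.complete(data);
    }

    // "Blocking" API: just parks the caller until the async event fires.
    String blockingRead() throws Exception {
        return pending.get(1, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws Exception {
        var io = new BlockingOverAsync();
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        // Simulate data arriving 50 ms later on another thread.
        scheduler.schedule(() -> io.onDataAvailable("ok"), 50, TimeUnit.MILLISECONDS);
        System.out.println(io.blockingRead()); // parks ~50 ms, then prints "ok"
        scheduler.shutdown();
    }
}
```

Either way a thread waits out the delay, which is why the sync and async routes in the benchmark end up in the same ballpark.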

    Also, your load-testing workloads are not realistic.

    • You want many more client machines doing the testing. No single software client alone is capable of stressing a Jetty server; you'll hit system resource limits well before you hit any kind of Jetty serving limit.
      • You need at least a 4-to-1 ratio (we test with an 8-to-1 ratio) of client machines to server machines to generate enough load to stress Jetty.
    • You want many concurrent connections to the server (think 40,000+).
      • Or you want HTTP/2 in the picture (which also stresses the server's resources in its own unique ways).
    • You want lots of data returned (something that would take multiple network buffers to return).
    • You want to sprinkle in a few client connections that are slow to read as well (on a synchronous server these can impact the rest of the connections, the ones that are not slow, simply by consuming too many resources).
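The slow-reader point can be made concrete with a small, self-contained JDK sketch. This is a toy illustration, not Jetty and not part of the benchmark above; the payload size, buffer size, and per-read delay are arbitrary choices. With blocking writes, the serving thread is pinned for as long as the slowest client takes to drain the response:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Toy demo: a slow-reading client holds a blocking-write server thread
// hostage for the entire transfer.
public class SlowReaderDemo {
    public static void main(String[] args) throws Exception {
        byte[] payload = new byte[256 * 1024]; // enough to overflow socket buffers
        try (ServerSocket server = new ServerSocket(0)) {
            ExecutorService pool = Executors.newSingleThreadExecutor();
            pool.submit(() -> {
                // "Server": one blocking write of the whole payload.
                try (Socket s = server.accept();
                     OutputStream out = s.getOutputStream()) {
                    long start = System.nanoTime();
                    out.write(payload); // stalls while the client dawdles
                    out.flush();
                    long heldMs = (System.nanoTime() - start) / 1_000_000;
                    System.out.println("serving thread held for ~" + heldMs + " ms");
                } catch (IOException e) {
                    e.printStackTrace();
                }
                return null;
            });

            // "Client": connect with a small receive buffer and read slowly.
            try (Socket client = new Socket()) {
                client.setReceiveBufferSize(4 * 1024); // set before connect
                client.connect(new InetSocketAddress("localhost", server.getLocalPort()));
                InputStream in = client.getInputStream();
                byte[] buf = new byte[4 * 1024];
                int total = 0;
                int n;
                while (total < payload.length && (n = in.read(buf)) >= 0) {
                    total += n;
                    Thread.sleep(5); // deliberately slow consumer
                }
                System.out.println("client read " + total + " bytes");
            }
            pool.shutdown();
            pool.awaitTermination(10, TimeUnit.SECONDS);
        }
    }
}
```

Multiply one such pinned thread by a few hundred slow clients and a synchronous server's thread pool drains quickly, which is the failure mode async serving is actually designed to avoid; a 100 ms uniform sleep never exercises it.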