I have a Spring Boot application that sometimes has to serve a very big JSON payload (several MB) over a REST API, which takes considerable time to download.
The data is read from a DB, serialized into JSON and sent back to the client. The DB read operation is fast, even for big datasets, usually below 1 second. So my conclusion was that the most time-consuming part is the HTTP exchange.
I've enabled GZIP compression for the HTTP exchange so the payload should be compressed before being sent. This seems to work (the returned payload is indeed compressed), but there is no noticeable performance gain.
A curl request to the application's endpoint without compression takes 49 sec and yields a ~10 MB JSON payload:
curl -H "Content-Type: application/json" -H "Accept: application/json" -H "Authorization: Basic <REDACTED>" --data-binary @priorities-request.json 'https://<REDACTED>/api/rest/priorities' > priorities-response.json
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 10.0M 0 9.9M 100 85081 205k 1715 0:00:49 0:00:49 --:--:-- 239k
With GZIP compression enabled, the same request takes 42 sec and yields a ~260 KB GZIP-compressed JSON payload:
curl -H "Content-Type: application/json" -H "Accept: application/json" -H "Accept-Encoding: gzip,deflate,br" -H "Authorization: Basic <REDACTED>" --data-binary @priorities-request.json 'https://<REDACTED>/api/rest/priorities' > priorities-response.json
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 259k 0 176k 100 85081 4221 1991 0:00:42 0:00:42 --:--:-- 14408
My expectation would be that downloading a compressed 260K payload would take considerably less time than an uncompressed 10 MB download.
What's my mistake?
Edit: Because it's been asked in the comments how I set up the GZIP compression: I set compression="on" and compressableMimeType="application/json" in the server.xml of Tomcat. That's it. The rest is done by Tomcat's org.apache.coyote.http11.filters.GzipOutputFilter class.
Edit 2: To rule out that serializing the data into JSON is where the time is lost, I tested locally with Jackson2JsonMessageConverter, but it took only about 0.5 seconds to write even a huge data structure into a 10MB JSON string.
Edit 3: What I find most puzzling is that the client application that consumes the API, which is running on another Tomcat instance on the same physical machine, still experiences the same delay when retrieving the data.
We figured it out: It turned out the HTTP exchange didn't have anything to do with it.
The bottleneck was, in fact, the database, but we didn't notice at first because the JPA query returns almost immediately.
What we weren't seeing was that lots of properties in the retrieved objects are loaded "lazily", so the DB queries for them are only executed when the JSON serializer accesses these properties. Those queries didn't use any CPU time on the Tomcat machine, so we couldn't detect the time loss by profiling.
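To illustrate the effect, here is a minimal plain-Java sketch that only simulates lazy loading; it does not use JPA or Jackson. The class LazyLoadingDemo, the Priority entity, its details property and the toJson helper are all hypothetical names made up for this example, and a counter stands in for real database round trips. It shows that the "main query" costs one round trip, while serializing the result triggers one more per object:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class LazyLoadingDemo {

    // Counts simulated database round trips (assumption: one per query).
    static final AtomicInteger queryCount = new AtomicInteger();

    // Hypothetical entity; 'details' stands in for a lazily loaded property.
    static class Priority {
        final long id;
        private final Supplier<String> detailsLoader;
        private String details; // stays null until first access

        Priority(long id, Supplier<String> detailsLoader) {
            this.id = id;
            this.detailsLoader = detailsLoader;
        }

        String getDetails() {
            if (details == null) {
                details = detailsLoader.get(); // simulated extra query
            }
            return details;
        }
    }

    // Simulates the main query: a single round trip that returns
    // entities whose 'details' have not been loaded yet.
    static List<Priority> findAll(int n) {
        queryCount.incrementAndGet();
        List<Priority> result = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            final long id = i;
            result.add(new Priority(id, () -> {
                queryCount.incrementAndGet();
                return "details-" + id;
            }));
        }
        return result;
    }

    // Naive stand-in for a JSON serializer: it calls every getter,
    // which is what triggers the lazy loads.
    static String toJson(List<Priority> items) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < items.size(); i++) {
            Priority p = items.get(i);
            if (i > 0) {
                sb.append(',');
            }
            sb.append("{\"id\":").append(p.id)
              .append(",\"details\":\"").append(p.getDetails()).append("\"}");
        }
        return sb.append(']').toString();
    }

    public static void main(String[] args) {
        List<Priority> items = findAll(1000);
        System.out.println("Queries after the main query: " + queryCount.get());
        toJson(items);
        System.out.println("Queries after serialization: " + queryCount.get());
    }
}
```

With 1000 objects, the counter goes from 1 after the main query to 1001 after serialization. In a real application each of those extra queries is a network round trip to the database, which is why the time shows up as waiting rather than as CPU usage.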