Tags: json, spring-boot, rest, http, gzip

HTTP gzip compression does not save any time


I have a Spring Boot application that sometimes has to serve a very big JSON payload (several MB) over a REST API, which takes considerable time to download.

The data is read from a DB, serialized into JSON and sent back to the client. The DB read operation is fast, even for big datasets, usually below 1 second. So my conclusion was that the most time-consuming part is the HTTP exchange.

I've enabled GZIP compression for the HTTP exchange so the payload should be compressed before being sent. This seems to work (the returned payload is indeed compressed), but there is no noticeable performance gain.

A curl request to the application's endpoint without compression takes 49 sec and yields a ~10 MB JSON payload:

curl -H "Content-Type: application/json" -H "Accept: application/json" -H "Authorization: Basic <REDACTED>" --data-binary @priorities-request.json  'https://<REDACTED>/api/rest/priorities' > priorities-response.json                
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 10.0M    0  9.9M  100 85081   205k   1715  0:00:49  0:00:49 --:--:--  239k

With GZIP compression enabled, the same request takes 42 sec and yields a ~260 KB gzip-compressed JSON payload:

curl -H "Content-Type: application/json" -H "Accept: application/json" -H "Accept-Encoding: gzip,deflate,br" -H "Authorization: Basic <REDACTED>" --data-binary @priorities-request.json  'https://<REDACTED>/api/rest/priorities' > priorities-response.json 
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  259k    0  176k  100 85081   4221   1991  0:00:42  0:00:42 --:--:-- 14408

I would expect downloading a compressed 260 KB payload to take considerably less time than downloading an uncompressed 10 MB payload.

What's my mistake?

Edit: Because it's been asked in the comments how I set up the GZIP compression: I set compression="on" and compressableMimeType="application/json" in the server.xml of Tomcat. That's it. The rest is done by Tomcat's org.apache.coyote.http11.filters.GzipOutputFilter class.

Edit 2: To rule out that serializing the data into JSON is where the time is lost, I tested locally with Jackson2JsonMessageConverter, but it took only about 0.5 seconds to write even a huge data structure into a 10MB JSON string.

Edit 3: What I find most puzzling is that the client application that consumes the API, which is running on another Tomcat instance on the same physical machine, still experiences the same delay when retrieving the data.


Solution

  • We figured it out: It turned out the HTTP exchange didn't have anything to do with it.

    The bottleneck was, in fact, the database, but we didn't notice at first because the JPA query returns almost immediately.

    What we weren't seeing was that lots of properties in the retrieved objects are loaded "lazily", so the DB queries for them are only executed when the JSON serializer accesses these properties. Those queries didn't use any CPU time on the Tomcat machine, so we couldn't detect the time loss by profiling.
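
To illustrate the effect described above, here is a minimal plain-Java sketch. It is not our actual code and uses no JPA at all: `Item`, `fetchItems`, `toJson` and `simulatedQuery` are hypothetical names, and `Thread.sleep` stands in for a database round trip. It only aims to show how a "fast" main query can hide one extra query per object that fires during serialization, and why that waiting shows up in wall-clock time but barely in CPU time.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

public class LazyLoadingDemo {

    /** Counts simulated database round trips. */
    static int queryCount = 0;

    /** Stand-in for a DB round trip: the thread waits but burns almost no CPU. */
    static String simulatedQuery(String result, long latencyMillis) {
        queryCount++;
        try {
            Thread.sleep(latencyMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return result;
    }

    /** Hypothetical entity whose "details" property is loaded on first access. */
    static final class Item {
        final int id;
        private final Supplier<String> loader;
        private String details; // stays null until getDetails() is called

        Item(int id, Supplier<String> loader) {
            this.id = id;
            this.loader = loader;
        }

        String getDetails() {
            if (details == null) {
                details = loader.get(); // the hidden query happens here
            }
            return details;
        }
    }

    /** The "fast" main query: one round trip, details are left unloaded. */
    static List<Item> fetchItems(int n, long latencyMillis) {
        simulatedQuery("items", latencyMillis);
        List<Item> items = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            final int id = i;
            items.add(new Item(id, () -> simulatedQuery("details-" + id, latencyMillis)));
        }
        return items;
    }

    /** Naive serializer: reading getDetails() triggers one query per item. */
    static String toJson(List<Item> items) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < items.size(); i++) {
            Item item = items.get(i);
            if (i > 0) {
                sb.append(',');
            }
            sb.append("{\"id\":").append(item.id)
              .append(",\"details\":\"").append(item.getDetails()).append("\"}");
        }
        return sb.append(']').toString();
    }

    public static void main(String[] args) {
        List<Item> items = fetchItems(200, 5);
        System.out.println("queries after fetch:     " + queryCount);

        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        boolean cpuSupported = threads.isCurrentThreadCpuTimeSupported();
        long wallStart = System.nanoTime();
        long cpuStart = cpuSupported ? threads.getCurrentThreadCpuTime() : 0;

        String json = toJson(items);

        long wallMs = (System.nanoTime() - wallStart) / 1_000_000;
        System.out.println("queries after serialize: " + queryCount);
        System.out.println("JSON length:             " + json.length());
        System.out.println("wall-clock ms:           " + wallMs);
        if (cpuSupported) {
            long cpuMs = (threads.getCurrentThreadCpuTime() - cpuStart) / 1_000_000;
            System.out.println("CPU ms:                  " + cpuMs);
        }
    }
}
```

With 200 items the query count goes from 1 after the fetch to 201 after serialization. With the assumed 5 ms latency per round trip, the serialization step should take on the order of a second of wall-clock time while using very little CPU time, which matches why a CPU-oriented profile of the Tomcat process did not point us at the problem.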