
Varnish and ESI, how is the performance?


I'm wondering how the ESI module performs nowadays. I've read some posts on the web claiming that ESI on Varnish was actually slower than the real thing, i.e. serving the page without ESI.

Say I had a page with over 3500 ESI includes: how would this perform? Is ESI designed for such usage?


Solution

  • We're using Varnish and ESI to embed sub-documents into JSON documents. Basically a response from our app-server looks like this:

    [
      <esi:include src="/station/best_of_80s" />,
      <esi:include src="/station/herrmerktradio" />,
      <esi:include src="/station/bluesclub" />,
      <esi:include src="/station/jazzloft" />,
      <esi:include src="/station/jahfari" />,
      <esi:include src="/station/maximix" />,
      <esi:include src="/station/ondalatina" />,
      <esi:include src="/station/deepgroove" />,
      <esi:include src="/station/germanyfm" />,
      <esi:include src="/station/alternativeworld" />
    ]
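
A rough sketch of how an app server could emit such a template, in Python. The function names `esi_station_list` and `resolve_includes` are made up for illustration, and `resolve_includes` is only a stand-in for the substitution Varnish performs, not Varnish code:

```python
import json
import re

def esi_station_list(slugs):
    """Build a JSON-array template with one <esi:include> per station slug."""
    includes = [f'  <esi:include src="/station/{slug}" />' for slug in slugs]
    return "[\n" + ",\n".join(includes) + "\n]"

def resolve_includes(template, fetch):
    """Stand-in for the cache: replace each include tag with the fetched body."""
    pattern = re.compile(r'<esi:include src="([^"]+)" />')
    return pattern.sub(lambda m: fetch(m.group(1)), template)

template = esi_station_list(["best_of_80s", "bluesclub"])

# Each fragment is valid JSON on its own, so the assembled body is valid JSON too.
body = resolve_includes(template, lambda path: json.dumps({"path": path}))
stations = json.loads(body)
print(len(stations), "stations assembled")
```

The point is that the commas and brackets live in the template, while every included resource stays a self-contained JSON document.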
    

    The included resources are complete and valid JSON responses on their own. The complete list comprises about 1070 stations, so when the cache is cold and the first request is for the complete station list, Varnish issues roughly 1000 requests to our backend. With a hot cache, ab reports the following:

    $ ab -c 100 -n 1000 http://127.0.0.1/stations
    [...]
    
    Document Path:          /stations
    Document Length:        2207910 bytes
    
    Concurrency Level:      100
    Time taken for tests:   10.075 seconds
    Complete requests:      1000
    Failed requests:        0
    Write errors:           0
    Total transferred:      2208412000 bytes
    HTML transferred:       2207910000 bytes
    Requests per second:    99.26 [#/sec] (mean)
    Time per request:       1007.470 [ms] (mean)
    Time per request:       10.075 [ms] (mean, across all concurrent requests)
    Transfer rate:          214066.18 [Kbytes/sec] received
    
    Connection Times (ms)
                  min  mean[+/-sd] median   max
    Connect:        1   11   7.3      9      37
    Processing:   466  971  97.4    951    1226
    Waiting:        0   20  16.6     12      86
    Total:        471  982  98.0    960    1230
    
    Percentage of the requests served within a certain time (ms)
      50%    960
      66%    985
      75%    986
      80%    988
      90%   1141
      95%   1163
      98%   1221
      99%   1229
     100%   1230 (longest request)
    $ 
    

    About 100 req/sec doesn't look that good, but consider the size of the document: 214066 Kbytes/sec is well beyond what a 1 Gbit interface can carry.
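
To make that comparison concrete, here is the conversion as a small Python calculation. It only uses numbers from the ab output above and assumes that ab's "Kbytes" means 1024 bytes, which matches its own totals (2208412000 bytes in 10.075 seconds):

```python
KBYTES_PER_SEC = 214066.18   # "Transfer rate" line from ab
DOCUMENT_BYTES = 2207910     # "Document Length" line from ab
STATIONS = 1070              # approximate station count

bits_per_sec = KBYTES_PER_SEC * 1024 * 8
gbit_per_sec = bits_per_sec / 1e9
bytes_per_station = DOCUMENT_BYTES / STATIONS

print(f"{gbit_per_sec:.2f} Gbit/s")
print(f"{bytes_per_station:.0f} bytes per station on average")
```

That works out to roughly 1.75 Gbit/s, which is only reachable here because the test ran against 127.0.0.1.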

    A single request against a warm cache (ab -c 1 -n 1 ...) takes 83 ms.

    The backend itself is Redis-based. We're measuring a mean response time of 0.9 ms [sic] in New Relic. After restarting Varnish, the first request with a cold cache (ab -c 1 -n 1 ...) takes 3158 ms. This means Varnish and our backend together need about 3 ms per ESI include when generating the response. This is a standard Core i7 pizza box with 8 cores, and I measured this while it was under full load. We're serving about 150 million requests per month this way with a hit rate of 0.9. These numbers do suggest that the ESI includes are resolved serially.
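
The arithmetic behind that conclusion, as a short Python sketch. All inputs are the figures quoted in this answer; the "serial floor" is just my way of naming the backend time you would accumulate if every include were fetched one after another:

```python
INCLUDES = 1070          # approximate number of stations
COLD_MS = 3158           # first request after a Varnish restart
WARM_MS = 83             # single request with everything cached
BACKEND_MEAN_MS = 0.9    # mean backend response time

per_include_cold_ms = COLD_MS / INCLUDES
per_include_warm_ms = WARM_MS / INCLUDES
serial_backend_floor_ms = INCLUDES * BACKEND_MEAN_MS

print(f"cold: {per_include_cold_ms:.2f} ms per include")
print(f"warm: {per_include_warm_ms:.3f} ms per include")
print(f"backend time alone, one by one: {serial_backend_floor_ms:.0f} ms")
```

The cold request costs about 2.95 ms per include, and the backend time alone adds up to about 963 ms when summed serially. A cold total of 3158 ms sits above that floor, which is what you would expect from one-at-a-time fetches plus per-request overhead. Fetching in parallel should have come in well below it.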

    What you have to consider when designing a system like this is 1) that your backend can take the load after a Varnish restart, when the cache is cold, and 2) that, ideally, your resources don't all expire at once. Our stations expire every full hour, but we add a random value of up to 120 seconds to the expiration header.
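
A minimal Python sketch of that kind of jitter, assuming the backend sets an `Expires` header at the next full hour plus a random offset. The helper names and the choice of whole seconds are mine, not taken from our production code:

```python
import random
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

MAX_JITTER_SECONDS = 120  # upper bound for the random offset

def jittered_expiry(now, rng=random):
    """Next full hour after `now`, plus 0..120 random seconds."""
    next_hour = now.replace(minute=0, second=0, microsecond=0) + timedelta(hours=1)
    return next_hour + timedelta(seconds=rng.randint(0, MAX_JITTER_SECONDS))

def expires_header(now, rng=random):
    """Format the jittered expiry as an HTTP date; `now` must be in UTC."""
    return format_datetime(jittered_expiry(now, rng), usegmt=True)

now = datetime(2024, 1, 1, 10, 17, tzinfo=timezone.utc)
print("Expires:", expires_header(now))
```

With around 1070 resources spread over a two-minute window, the refetches after each full hour trickle in instead of hitting the backend all at once.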

    Hope that helps.