Tags: google-chrome, sockets, google-chrome-devtools, http-1.1, stalled

Zombie SSE connections not reported in Chrome Dev Tools


Background

I'm building a web application for internal use, served over HTTP/1.1. It will not use TLS, so HTTP/2 is unavailable (browsers only support it over TLS), as are HTTP/3 and QUIC. These newer protocols would solve the problem, but they are currently not an option.

The client uses:

  • some static resources
  • some API endpoints (all very fast, all using Fetch API)
  • 2 concurrent, long-running server-sent event (SSE) streams

All requests are served by nginx, but it looks like my issue is not related to it.

Issue

Sometimes, after refreshing the page, some requests stall for many seconds. The requests are stalled by Chrome itself: it does not send the HTTP request when the request is initiated by navigation or by the JS Fetch API, but only some time later. DevTools gives no hint as to why that happens.

Chrome Dev Tools screenshot

I tried to exclude some common issues:

  • higher priority requests: it's a Fetch, so it should not be considered as a low priority request; also, there are no other requests that get prioritized.
  • server worker threads: I don't see any error message, and it was clear that the request was not even leaving Chrome.
  • QUIC: I'm running on HTTP/1.1, not HTTP/2, not HTTP/3, not QUIC.
  • disk space allocation for caching: I tried specifying {cache: 'no-store'} in the Fetch, with no luck.
  • Chrome's limit of six TCP connections per origin: This was actually my first check. DevTools never shows more than 2 SSE connections, and the stalled requests do not appear to start right after another request has finished. I didn't see any other pending request that might be holding a TCP socket. No other tabs were open. Spoiler: this limit was in fact being hit.

After some checks, I spent some time configuring nginx to always send Connection: close as an HTTP header in all responses, so that the client does not try to reuse the same TCP connection for upcoming requests. I thought that Chrome was optimistically stalling in order to reuse the TCP connection of the SSE requests, assuming it would become free shortly. The header did prevent connection reuse, but reuse was not the cause of the problem: there's no such optimistic stalling in Chrome, for good reasons.

Enter Wireshark

The six connections limit was actually being reached by concurrent SSE connections.

The point is: these SSE connections were not used by any tab and were not displayed in DevTools, yet the underlying TCP connections were left open. In fact, Wireshark lets me see the TCP four-way termination handshake:

  • It starts very late, long after the SSE stream was abandoned by refreshing the page.
  • It happens just moments before the stalled request departs.
  • It is initiated by Chrome.

More precisely, the sequence of captured TCP packets is (see screenshot below):

  • (stream 1) chrome > server: FIN, ACK (the zombie SSE connection)
  • (stream 1) server > chrome: ACK
  • (stream 13) chrome > server: SYN
  • (stream 13) server > chrome: SYN, ACK
  • (stream 13) chrome > server: ACK
  • (stream 13) chrome > server: GET ... (the request that was stalled for a lot of time)

The above events happen within 1.2 milliseconds, so it seems clear to me that Chrome stalls the request of stream 13 until the client-to-server side of stream 1's connection is closed. Upon further examination I can also confirm that closing stream 1 did free the sixth connection that was open at that moment: the number of open TCP connections dropped from 6 to 5, which allowed a new connection to be opened for the stalled request. (I'm omitting the subsequent ACKs and the full termination of stream 13 that follow the above packets.)

Wireshark screenshot
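
The behaviour seen in the capture can be mimicked with a toy model of a per-origin connection limit. This is only a sketch and not Chrome's actual implementation: the `ConnectionPool` class, the connection labels, and the scenario of three page loads that each leave two SSE streams open are assumptions made up for illustration.

```javascript
// Toy model of a per-origin connection pool (NOT Chrome's real code).
class ConnectionPool {
  constructor(limit = 6) {
    this.limit = limit;
    this.open = new Set(); // labels of connections currently holding a socket
    this.queue = [];       // requests waiting for a free slot ("stalled")
  }

  // Start a request; it stalls if every slot is already taken.
  request(label) {
    if (this.open.size < this.limit) {
      this.open.add(label);
      return "sent";
    }
    this.queue.push(label);
    return "stalled";
  }

  // Closing a connection frees a slot and dispatches the oldest stalled request.
  close(label) {
    if (!this.open.delete(label)) return null;
    const next = this.queue.shift();
    if (next !== undefined) this.open.add(next);
    return next ?? null;
  }
}

const pool = new ConnectionPool(6);

// Assumed scenario: three page loads, two SSE streams each, none ever closed.
// The streams of the first two loads are the "zombies".
for (let load = 1; load <= 3; load++) {
  pool.request(`sse-a#${load}`);
  pool.request(`sse-b#${load}`);
}

const status = pool.request("GET /api"); // all six slots are taken
const released = pool.close("sse-a#1");  // a zombie is finally closed
console.log(status, released);           // prints: stalled GET /api
```

In this model the stalled request departs at the very moment a zombie connection is closed, which matches the FIN on stream 1 being followed immediately by the SYN on stream 13.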

Questions

Why is the SSE connection still open?

How can I avoid zombie SSE connections?

Why are the zombie connections not reported in the Dev Tools?


Solution

  • I reported this as a Chromium bug: https://issues.chromium.org/issues/358538891


    How can I avoid zombie SSE connections?

    Solution: explicitly close Server-Sent Event sources before unload:

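    // Keep a reference to the stream so it can be closed explicitly.
    const sse = new EventSource(...)
    const handler = () => {
        // close() ends the stream; calling it twice is harmless.
        sse.close()
    }
    // Register on both events so the stream is closed whichever fires.
    window.addEventListener('beforeunload', handler)
    window.addEventListener('unload', handler)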
    

    I isolated this as the solution by repeatedly commenting the handler in and out, and confirmed that the SSE connections are consistently closed on page reload if and only if I explicitly close them.
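
Since the page holds 2 SSE streams, the same fix can be wrapped in a small helper that closes all of them. This is only a sketch: `closeStreamsOnExit`, its default event list, and the stand-in objects are hypothetical names introduced here, and the stand-ins exist only so the snippet also runs outside a browser (Node 15+ provides `EventTarget` and `Event` globally).

```javascript
// Hypothetical helper: close every given stream when the page is about to go away.
// `target` is normally `window`; `streams` is any iterable of objects with close().
function closeStreamsOnExit(target, streams, events = ["beforeunload", "unload"]) {
  const closeAll = () => {
    for (const stream of streams) {
      stream.close(); // for an EventSource, closing an already closed stream is a no-op
    }
  };
  for (const type of events) {
    target.addEventListener(type, closeAll);
  }
  return closeAll; // returned so the caller can also close the streams manually
}

// Stand-ins for `window` and for two EventSource objects (assumptions for the demo).
const fakeWindow = new EventTarget();
const fakeStreams = [
  { closed: false, close() { this.closed = true; } },
  { closed: false, close() { this.closed = true; } },
];

closeStreamsOnExit(fakeWindow, fakeStreams);
fakeWindow.dispatchEvent(new Event("beforeunload"));
console.log(fakeStreams.every((s) => s.closed)); // prints: true
```

In the browser the call would look like `closeStreamsOnExit(window, [sseA, sseB])`, where `sseA` and `sseB` stand for the page's two `EventSource` objects.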

    Still, the known connection limit for SSE over HTTP/1.1 doesn't seem to me like a good excuse for leaving zombie SSE connections behind after page unload. DevTools reports the SSE request as completed (its bar in the waterfall column stops growing), while the connection is actually still open, detached from the page.