How does webpush work at the TCP/IP network layers?

Please explain to me how webpush works at the TCP/IP network layers (especially layers 4-5).

I understand that HTTP is a stateless protocol:

  • the client opens a TCP (layer 4) connection,
  • 'state' is 'made to work' with cookies/sessions,
  • then the client sends an HTTP request (plaintext/compressed "GET /url/here HTTP/1.1 ... Content-Length: ..."),
  • then the server responds with an HTTP response (plaintext/compressed "HTTP/1.1 200 OK ... ...").
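
To make the steps above concrete, here is a minimal sketch in Python of one request/response exchange over a plain TCP socket on the loopback interface. The helper names (`serve_once`, `fetch`, `demo`) and the "hello" body are made up for illustration; a real browser and web server do far more (TLS, compression, keep-alive, HTTP/2).

```python
import socket
import threading


def serve_once(listener: socket.socket) -> None:
    """Accept one TCP connection, read one request, send one response."""
    conn, _ = listener.accept()
    with conn:
        request = b""
        # Read until the blank line that ends the request headers.
        while b"\r\n\r\n" not in request:
            chunk = conn.recv(4096)
            if not chunk:
                break
            request += chunk
        body = b"hello"
        conn.sendall(
            b"HTTP/1.1 200 OK\r\n"
            b"Content-Type: text/plain\r\n"
            b"Content-Length: " + str(len(body)).encode() + b"\r\n"
            b"Connection: close\r\n"
            b"\r\n" + body
        )


def fetch(port: int, path: str) -> bytes:
    """Open a TCP connection (layer 4), then speak HTTP/1.1 over it."""
    with socket.create_connection(("127.0.0.1", port), timeout=5) as sock:
        request = (
            f"GET {path} HTTP/1.1\r\n"
            "Host: 127.0.0.1\r\n"
            "Connection: close\r\n"
            "\r\n"
        )
        sock.sendall(request.encode("ascii"))
        chunks = []
        while True:  # the server closes the connection when it is done
            chunk = sock.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks)


def demo() -> bytes:
    listener = socket.create_server(("127.0.0.1", 0))  # port 0 = any free port
    port = listener.getsockname()[1]
    thread = threading.Thread(target=serve_once, args=(listener,), daemon=True)
    thread.start()
    try:
        return fetch(port, "/url/here")
    finally:
        thread.join(timeout=5)
        listener.close()


if __name__ == "__main__":
    print(demo().decode())
```

Note that the client is the side calling `create_connection`; the server only ever answers on a connection that already exists.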

Therefore it's understandable that a user behind NAT can view a webpage on a remote host (because the user behind NAT is the one initiating the connection), but the web server cannot initiate a TCP connection to the client (browser process).

However, there are some exceptions like WebSocket, where the client (browser) initiates a connection and then leaves it open (the connection is upgraded from HTTP to the WebSocket protocol on the same TCP connection). In this architecture, the web server may initiate sending a message to the client (for example a "you have a new chat message" notification).

What I don't understand is the new term 'webpush'.

  • I observed that it can send notifications from the server to the client/browser (to the user, it 'feels' like the server is the one initiating the connection)
  • webpush can send a notification at any time, even when the browser is closed / not opened yet (as when the device has just been turned on), or when it has just connected to the internet

How does it work? How do they accomplish this? Previously I thought that:

  • either JavaScript in a page is continuously (e.g. at a 5-second interval) checking whether there's a new notification on the server,
  • or JavaScript opens a WebSocket (the browser initiates the TCP connection) and keeps it alive; when the server needs to send something, it's sent from the web server to the client/browser through this connection

Is this correct? Or am I missing something? I ask because I think both of my guesses above won't work behind a NAT'd network.

Are Firebase web notifications also this kind of webpush?

I have searched the internet for an explanation of what makes it work on the client side, but there seem to be only explanations of 'how to send webpush' or 'how to market your product with webpush'. Those articles only explain the server side (communication between the app server and the push service server) or are about marketing.

Also, I'm interested in understanding what application layer protocol they're running on (as in what text/binary data the client and server send to each other), if it's not HTTP.


  • Web Push works because there is a persistent connection between the browser (e.g. Chrome) and the browser push service (e.g. FCM).

    When your application server needs to send a notification to a browser, it cannot open a connection to the browser directly. Instead it contacts the browser push service (e.g. FCM for Chrome), and it's the browser push service that delivers the notification to the user's browser.

    This is possible because the browser constantly tries to keep an open connection with the push service (e.g. FCM for Chrome). This means that NAT isn't a problem, since it's the client that starts the connection. Also consider that any TCP connection is bidirectional: once it is established, either side can start sending data at any time. Don't confuse higher-level protocols like HTTP with a plain TCP connection.

    If you want more details I have written this article that explains in simple words how Web Push works. You can also read the standards: Push API and IETF Web Push in particular.

    Note: Firebase (FCM) is two different things, even if that is not clear from the documentation. It is the browser push service required to deliver notifications to Chrome (like Mozilla autopush for Firefox, Windows Push Notification Services for Edge and Apple Push Notification service for Safari), but it is also a proprietary service with additional features for sending notifications to any browser (like Pushpad, OneSignal and many others).
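
As a rough illustration of the flow described above, here is a toy Python sketch with three parties on the loopback interface: a "browser" that opens a connection and waits, a stand-in "push service", and an "application server" that only ever talks to the push service. The `SUBSCRIBE`/`SEND`/`PUSH` line protocol, the class and function names, and the client id are all invented for this example; it is not what FCM or any real push service speaks, and it leaves out encryption and authentication entirely.

```python
import socket
import threading


def read_line(sock: socket.socket) -> str:
    """Read bytes up to and including a newline (slow but simple)."""
    data = bytearray()
    while not data.endswith(b"\n"):
        chunk = sock.recv(1)
        if not chunk:
            break
        data += chunk
    return data.decode().strip()


class ToyPushService:
    """Stand-in for a browser push service; the line protocol is made up."""

    def __init__(self) -> None:
        self.listener = socket.create_server(("127.0.0.1", 0))
        self.port = self.listener.getsockname()[1]
        self.subscribers = {}  # client id -> open socket started by the client
        self.lock = threading.Lock()
        threading.Thread(target=self._accept_loop, daemon=True).start()

    def _accept_loop(self) -> None:
        while True:
            try:
                conn, _ = self.listener.accept()
            except OSError:
                return  # listener was closed
            threading.Thread(target=self._handle, args=(conn,), daemon=True).start()

    def _handle(self, conn: socket.socket) -> None:
        command, _, rest = read_line(conn).partition(" ")
        if command == "SUBSCRIBE":
            # Keep the client's connection open so data can be sent later.
            with self.lock:
                self.subscribers[rest] = conn
            conn.sendall(b"OK\n")
        elif command == "SEND":
            client_id, _, message = rest.partition(" ")
            with self.lock:
                target = self.subscribers.get(client_id)
            if target is not None:
                # The service writes on the connection the client opened.
                target.sendall(f"PUSH {message}\n".encode())
            conn.sendall(b"OK\n" if target is not None else b"UNKNOWN\n")
            conn.close()
        else:
            conn.close()


def browser_connect(port: int, client_id: str) -> socket.socket:
    """The client (possibly behind NAT) opens the connection, then waits."""
    sock = socket.create_connection(("127.0.0.1", port), timeout=5)
    sock.sendall(f"SUBSCRIBE {client_id}\n".encode())
    if read_line(sock) != "OK":
        raise RuntimeError("subscription was not acknowledged")
    return sock


def app_server_send(port: int, client_id: str, message: str) -> str:
    """The application server contacts the push service, never the browser."""
    with socket.create_connection(("127.0.0.1", port), timeout=5) as sock:
        sock.sendall(f"SEND {client_id} {message}\n".encode())
        return read_line(sock)


def demo():
    service = ToyPushService()
    browser = browser_connect(service.port, "browser-1")
    status = app_server_send(service.port, "browser-1", "you have a new chat message")
    pushed = read_line(browser)  # arrives without the browser asking again
    browser.close()
    service.listener.close()
    return status, pushed


if __name__ == "__main__":
    print(demo())
```

The point of the sketch is that the only `create_connection` calls come from the browser and the application server; the push service never dials out, yet it is the one that decides when the `PUSH` line is written.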