node.js sockets tcp

Accessing TCP protocol data on net.Socket()


I'm trying to analyze MariaDB/MySQL packets through Node. I've gotten pretty much everything working; however, when a response is larger than ~64K it arrives split into smaller chunks. This means that the socket "data" event is called several times, and it is not possible to know when the server has finished sending the data (unless I parse the incoming payload and check that it forms a valid packet, which is quite counterproductive).

Reading more about it, I realized that the TCP protocol itself has flags that could help: ACK and PSH (the latter, as far as I understood, marks the last packet sent).

So, if I receive 1MB from the server, the response will be split into 16 packets of 64KB each; the first 15 will only have the ACK flag and the last one ACK & PSH.

How can I get this information in net.Socket()?

It seems to me that Node does not allow direct access to information from the TCP protocol itself, and properties like bytesRead, readableLength or readableHighWaterMark don't contain anything that helps me.

This is my current code:

  private send(write: Buffer, onData: OnDataCallback) {
    const buffers: Buffer[] = [];
    const rebuild: OnDataCallback = (partial) => {
      // Here we read() from Socket the SQL command response from Server:
      buffers.push(partial);

      // The listener must be removed with off() once all data has arrived: if(...) { ... }
      this.socket.off("data", rebuild);
      // Then complete data is concat() and submitted to callback:
      onData(Buffer.concat(buffers));
    };

    this.socket.on("data", rebuild);

    // Here we write() to Socket a SQL command:
    this.socket.write(write);
  }

Solution

  • You can't get this information from net.Socket in Node.js (or from a regular stream socket in any other user-level application). The flags on the TCP packets are handled by the kernel, which reassembles the data stream for the consuming application (your Node app). The details of the TCP packets (flags, sequence numbers, and so on) are handled by the kernel only and are not exposed through the socket API.

    All you see from there is that some new data is available, but you have no reliable information about when the individual packets arrived, in which order they arrived, or whether any lost packets had to be retransmitted.

    To access the packet-level data, you need to use something lower in the stack, commonly a tool called tcpdump, to get the raw packet details from the kernel.

    Also, the PSH flag is an optimization used by MariaDB/MySQL to ensure that data sent in the last packet of a response is delivered to the client application more quickly by the receiving kernel, instead of potentially waiting for more packets to arrive. You should probably not rely on this flag's presence to detect when a reply has been completely sent. For that, you actually have to parse the MySQL protocol.
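
    As a rough sketch of what parsing the protocol means at the framing level: every MySQL/MariaDB protocol packet starts with a 4-byte header (a 3-byte little-endian payload length followed by a 1-byte sequence id), so packet boundaries can be recovered from the byte stream no matter how the "data" events chunk it. The names PacketFramer and MySqlPacket below are hypothetical helpers written for this illustration, not part of Node.js or of any MySQL driver.

```typescript
import { Buffer } from "node:buffer";

// Hypothetical types/helpers for illustration only.
interface MySqlPacket {
  sequenceId: number;
  payload: Buffer;
}

// 3-byte little-endian payload length + 1-byte sequence id.
const HEADER_LENGTH = 4;

class PacketFramer {
  // Bytes received so far that do not yet form a complete packet.
  private pending: Buffer = Buffer.alloc(0);

  // Feed one chunk from socket.on("data"); returns the packets it completed.
  push(chunk: Buffer): MySqlPacket[] {
    this.pending = Buffer.concat([this.pending, chunk]);
    const packets: MySqlPacket[] = [];

    while (this.pending.length >= HEADER_LENGTH) {
      const payloadLength = this.pending.readUIntLE(0, 3);
      const total = HEADER_LENGTH + payloadLength;
      if (this.pending.length < total) break; // wait for more data

      packets.push({
        sequenceId: this.pending.readUInt8(3),
        payload: this.pending.subarray(HEADER_LENGTH, total),
      });
      this.pending = this.pending.subarray(total);
    }
    // Note: a payload length of 0xFFFFFF means the logical message continues
    // in the next packet; this sketch does not merge such continuations.
    return packets;
  }
}

// Usage: a packet with payload "hello" and sequence id 1, split across two
// arbitrary chunks as the kernel might deliver it.
const wire = Buffer.from([5, 0, 0, 1, 0x68, 0x65, 0x6c, 0x6c, 0x6f]);
const framer = new PacketFramer();
const first = framer.push(wire.subarray(0, 6)); // header + 2 payload bytes
const second = framer.push(wire.subarray(6));   // remaining 3 payload bytes
```

    In a send() method like the one in the question, each "data" chunk would be passed to push(). Framing alone does not say when the whole response is finished: that still depends on the command, and has to be decided by inspecting the returned packets (for example, an OK, ERR or EOF packet) one layer up.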