Is it good practice to write a GB of bytes into a TCP socket in one go?

I am maintaining some mature production code which sends data over TCP sockets. It always breaks large chunks of data into many packets of 1000 bytes each. I just wonder why it was done this way. Why can't I just write a GB worth of a byte array into the socket in one go? What are the downsides of doing that?

Solution

There are many reasons not to throw a huge chunk in at once.

First of all: even on very fast networks, sending a GB of data will take a non-trivial amount of time. On a 10 Gbps network it would take a little under 1 second, which is a long time in computer speak. And that assumes that this one operation has all the bandwidth of the network available to it and doesn't have to share with anything else.
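To make that estimate concrete, here is a small back-of-the-envelope sketch. The helper name `transfer_seconds` is made up for illustration, and it assumes a decimal GB (10^9 bytes), an otherwise idle link, and no protocol overhead, so real transfers will be slower.

```python
def transfer_seconds(num_bytes, link_bits_per_second):
    """Best-case time to push num_bytes over a link, ignoring all overhead."""
    return num_bytes * 8 / link_bits_per_second


ONE_GB = 10**9  # assumption: decimal gigabyte

print(transfer_seconds(ONE_GB, 10 * 10**9))  # 10 Gbps link  -> 0.8
print(transfer_seconds(ONE_GB, 1 * 10**9))   # 1 Gbps link   -> 8.0
print(transfer_seconds(ONE_GB, 100 * 10**6)) # 100 Mbps link -> 80.0
```

On the slower links that are still common, the same write ties up the data for many seconds rather than a fraction of one.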

This means that if you successfully do a 1GB write call to a TCP socket, it will be some time until the later bits of data are actually sent.

And in the meantime you'll have to hold all that data in memory. That means that you'll need to allocate and hold on to 1 GB of data for that whole transaction.

If instead you fill a small-ish buffer and read from your source (or generate, depending on where the data comes from) before each write, then you'll need only a little memory (the size of the buffer).
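A minimal sketch of that pattern in Python might look like the following. The function name `send_stream` and the 64 KiB default are assumptions chosen for illustration, not anything taken from the code in the question; the source is assumed to be any binary file-like object with a `read` method.

```python
import socket
from typing import BinaryIO

CHUNK_SIZE = 64 * 1024  # assumed buffer size; tune it for your workload


def send_stream(sock: socket.socket, source: BinaryIO,
                chunk_size: int = CHUNK_SIZE) -> int:
    """Copy a file-like source to a connected socket one chunk at a time.

    Only one chunk is held in memory at any moment.
    Returns the total number of bytes sent.
    """
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # empty bytes means the source is exhausted
            break
        # sendall keeps calling send() until the whole chunk has been
        # accepted by the OS, or raises an exception on error.
        sock.sendall(chunk)
        total += len(chunk)
    return total
```

With a file opened in binary mode, a call could look like `send_stream(sock, open(path, "rb"))`. Memory use stays at roughly one chunk no matter how large the source is.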

All of that might not sound like a big deal with today's machines, but consider that many servers will serve hundreds of clients/requests at once, and if each one requires a 1 GB buffer, that can get out of hand quickly.
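As a rough illustration of how quickly this adds up, the sketch below compares the two approaches. The figure of 500 concurrent clients and the 64 KiB chunk size are made-up assumptions, and the helper `total_buffer_bytes` is hypothetical.

```python
def total_buffer_bytes(clients, buffer_bytes):
    """Memory needed if every client holds one buffer of the given size."""
    return clients * buffer_bytes


CLIENTS = 500               # assumption: concurrent clients
WHOLE_PAYLOAD = 10**9       # 1 GB held per client when written in one go
SMALL_BUFFER = 64 * 1024    # 64 KiB held per client when chunking

whole = total_buffer_bytes(CLIENTS, WHOLE_PAYLOAD)
chunked = total_buffer_bytes(CLIENTS, SMALL_BUFFER)

print(whole)    # 500000000000 bytes, i.e. 500 GB
print(chunked)  # 32768000 bytes, i.e. about 33 MB
```

The chunked version fits comfortably on a modest server, while the one-shot version does not fit in the RAM of most machines.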

Is 1000 a good size for that buffer? I'm no networking expert, but I suspect that's a little low. Maybe something on the order of 64k would be appropriate, but others can give better details here. Finding a good buffer size can sometimes be a bit tricky.