Tags: c, linux-kernel, tcp, client

TCP send/receive packet timeout in Linux


I’m using a Raspberry Pi B+ and building a TCP server/client connection in C. I have a few questions about the client side.

  1. How long does Linux queue received packets for the client? When a packet has been received by Linux, what happens if the client is not ready to process it, or the select/epoll call inside the loop sleeps for a minute? If there is a timeout, is there a way to adjust it in code or with a script?
  2. What is the internal path inside Linux when it receives a packet (i.e., Ethernet port -> kernel -> RAM -> application)?

Solution

  • The Raspberry Pi (with Linux), and any known Linux (or non-Linux) TCP/IP stack, works roughly like this:

    • You have a kernel buffer in which the kernel stores all the data received from the other side that has not yet been read by the user process. The kernel normally acknowledges all of this data to the other side (the acknowledgement states the last byte received and stored in that buffer). The sender side also has a buffer, where it stores all the sent data that has not yet been acknowledged by the receiver (this data must be resent in case of a timeout), plus data that does not yet fit in the window allowed by the receiver. If that buffer fills, the sender is blocked, or a partial write is reported to the user process (depending on options). A small getsockopt()/setsockopt() sketch after this list shows how to inspect and resize these buffers.
    • That kernel buffer (the read buffer) lets the kernel keep the data available for the user process while the process is not reading it. If the user process cannot read it yet, the data stays there until the process does a read() system call.
    • The amount of buffer space the kernel can still accept (known as the window size) is sent to the other end with each acknowledgement, so the sender knows the maximum amount of data it is authorized to send. When the buffer is full, the window size drops to zero and the receiver announces that it cannot receive more data. This allows a slow receiver to stop a fast sender from filling the network with data that cannot be delivered.
    • From then on (the zero-window situation), the sender periodically sends a segment with no data at all (or with just one byte of data, depending on the implementation) to check whether some window has opened and it may send more data. The acknowledgement to that probe allows it to start communicating again.
    • Everything is stopped now, but no timeout happens. Both TCP endpoints keep talking this way until some window becomes available (meaning the receiver has read() part of the buffer).
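    As a minimal sketch of how those buffer sizes can be inspected and tuned from C (assuming fd is an already connected TCP socket; the 256 KiB value is purely illustrative):

        #include <stdio.h>
        #include <sys/socket.h>

        static void show_and_grow_rcvbuf(int fd)
        {
            int size = 0;
            socklen_t len = sizeof(size);

            /* Current receive-buffer size; Linux reports roughly double the
             * value set, because it accounts for bookkeeping overhead. */
            if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &size, &len) == 0)
                printf("SO_RCVBUF is %d bytes\n", size);

            /* Ask for a larger buffer; the kernel clamps the request to the
             * net.core.rmem_max sysctl. */
            int wanted = 256 * 1024;
            if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &wanted, sizeof(wanted)) != 0)
                perror("setsockopt(SO_RCVBUF)");
        }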

    This situation can be maintained for days without any problem: the reading process is busy and cannot read the data, and the writing process is blocked in the write call until the kernel on the sending side has buffer space to accommodate the data to be written.
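    If you prefer not to block indefinitely in that situation, here is a sketch (assuming a connected socket fd; the 5-second value is arbitrary) of bounding the wait with SO_SNDTIMEO:

        #include <errno.h>
        #include <stdio.h>
        #include <sys/socket.h>
        #include <sys/time.h>
        #include <unistd.h>

        static ssize_t write_with_timeout(int fd, const void *buf, size_t len)
        {
            /* Limit how long a blocking write() may wait for send-buffer space. */
            struct timeval tv = { .tv_sec = 5, .tv_usec = 0 };
            if (setsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof(tv)) != 0)
                perror("setsockopt(SO_SNDTIMEO)");

            ssize_t n = write(fd, buf, len);
            if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
                fprintf(stderr, "send buffer stayed full for 5 s (receiver window closed?)\n");
            return n;
        }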

    When the reading process finally reads the data (a minimal read() loop is sketched after this list):

    • An acknowledgement of the last received byte is sent, announcing a new window size larger than zero (by the amount the reader process freed when reading).
    • The sender receives this acknowledgement and sends that amount of data from its buffer. If this makes room for the data the writer has asked to write, the writer is woken up and allowed to continue sending.
    • Again, timeouts normally only occur if data is lost in transit.
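    A minimal read() loop illustrating this (assuming a connected socket fd; real code would process the bytes instead of just counting them):

        #include <stdio.h>
        #include <unistd.h>

        static void drain_socket(int fd)
        {
            char buf[4096];
            ssize_t n;

            /* Each successful read() frees kernel receive-buffer space, so the
             * kernel's next ACK advertises a larger window and the blocked
             * sender can continue. */
            while ((n = read(fd, buf, sizeof(buf))) > 0)
                printf("consumed %zd bytes\n", n);

            if (n == 0)
                printf("peer closed the connection\n");
        }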

    But...

    If you are behind a NAT device, your connection state can be lost simply by not exercising it (the NAT device keeps a cache of the address/port mappings of local devices making connections to the outside). On the next data transfer coming from the remote device, the NAT device may (or may not) send a RST, because the packet refers to a connection it no longer knows about (the cache entry has expired).

    If the next packet comes from the internal device instead, the connection can be re-cached and continue; what happens depends on which side is the first to send a packet.
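    One common way to keep such a NAT mapping alive from the client is to enable TCP keepalives. A sketch, assuming a connected socket fd on Linux (the idle/interval/count values are illustrative, not recommendations):

        #include <netinet/in.h>
        #include <netinet/tcp.h>
        #include <stdio.h>
        #include <sys/socket.h>

        static void enable_keepalive(int fd)
        {
            int on = 1, idle = 60, intvl = 10, cnt = 5;

            if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) != 0)
                perror("SO_KEEPALIVE");

            /* Linux-specific knobs: start probing after 60 s of idle time,
             * probe every 10 s, and give up (ETIMEDOUT) after 5 unanswered
             * probes. */
            setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE,  &idle,  sizeof(idle));
            setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl, sizeof(intvl));
            setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT,   &cnt,   sizeof(cnt));
        }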

    Nothing requires an implementation to provide a timeout for data waiting to be sent, but some implementations do, aborting the connection with an error when data has stayed unacknowledged for a long time. TCP itself specifies no timeout for this case, so it is the process's responsibility to cope with it.
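    On Linux, one such implementation-specific timeout is TCP_USER_TIMEOUT (kernel 2.6.37 and later): if transmitted data stays unacknowledged for longer than the given time, the kernel aborts the connection and a later socket call fails with ETIMEDOUT. A sketch, assuming a connected socket fd and a 30-second limit chosen only for illustration:

        #include <netinet/in.h>
        #include <netinet/tcp.h>
        #include <stdio.h>
        #include <sys/socket.h>

        static void set_user_timeout(int fd)
        {
            unsigned int ms = 30000;  /* abort after 30 s of unacknowledged data */

            if (setsockopt(fd, IPPROTO_TCP, TCP_USER_TIMEOUT, &ms, sizeof(ms)) != 0)
                perror("setsockopt(TCP_USER_TIMEOUT)");
        }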

    TCP is specified in RFC 793, which must be obeyed by all implementations if they want communication to succeed. You can read it if you like; I think you will get a better explanation there than the one I can give you here.

    So, to answer your first question: the kernel will store the data in its buffer for as long as your process wants to wait for it. By default you just call read() (or write()) on the socket, and the kernel keeps waiting as long as you (the user) don't decide to stop the process and abort the operation; in that case the kernel will normally close or reset the connection. The resources are tied to the life of the process, so as long as the process is alive and holding the connection, the kernel will wait for it.
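    As for the "1 minute sleep in select/epoll" part of the question: that timeout is entirely yours to choose, and letting it expire loses nothing, because incoming data keeps waiting in the kernel buffer. A sketch with select(), assuming a connected socket fd and an application-chosen timeout in seconds:

        #include <stdio.h>
        #include <sys/select.h>
        #include <sys/time.h>

        static int wait_readable(int fd, int seconds)
        {
            fd_set rfds;
            struct timeval tv = { .tv_sec = seconds, .tv_usec = 0 };

            FD_ZERO(&rfds);
            FD_SET(fd, &rfds);

            int rc = select(fd + 1, &rfds, NULL, NULL, &tv);
            if (rc < 0)
                perror("select");
            else if (rc == 0)
                printf("no data after %d s; whatever arrives meanwhile stays queued in the kernel\n", seconds);

            return rc;  /* > 0 means read() will not block now */
        }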