Why are the file descriptors for TcpListener and the produced TcpStream different?


I am trying to investigate file descriptors and how they interact with TCP connections in Rust. I am quite new to both Rust programming and Unix.

Ultimately, I want to bind the stdio of some program to a TCP send/receive by matching up their file descriptors. For now, I just want to understand how the file descriptors are allocated, and why they get different values.

I wrote this simple code to experiment:

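use std::{
    net::TcpListener,
    os::fd::AsFd,
};

fn main() {
    // Create the listening socket and bind it to port 1024 on all interfaces.
    let listener = TcpListener::bind("0.0.0.0:1024").unwrap();
    println!("Listener fd: {:?}", listener.as_fd());
    // incoming() calls accept() in a loop; each item is one new connection.
    for stream in listener.incoming() {
        let stream = stream.unwrap();
        println!("Stream fd: {:?}", stream.as_fd());
        // `stream` is dropped here, which closes its file descriptor.
    }
}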

When I run this, I get a first line printed of:

Listener fd: BorrowedFd { fd: 3 }

And when I subsequently send some data to the TCP listener, it prints lines like this, as I expect:

Stream fd: BorrowedFd { fd: 4 }

What I want to understand is:

  • Why are the file descriptors different?
  • What does each file descriptor correspond to, i.e. does using 3 correspond to the incoming data of the connection, and 4 to the data I could write back to the stream?

Solution

  • A file descriptor is an opaque handle through which the kernel exposes some resource or functionality. In your case, file descriptor #3 was obtained by socket() and bound to a port; it is the listening socket. (It is 3 because 0, 1, and 2 are already taken by stdin, stdout, and stderr, and the kernel hands out the lowest free number.) Calling accept() on it yields further file descriptors (such as your #4), each of which is used to communicate with one connected client.

    File descriptor #4 is your handle to the connected peer. It supports operations like read() and write() (and many others, such as select()). These are the same operations normally provided on file descriptors that correspond to open files, which is how Unix provides polymorphism.

    Reads and writes on the same file descriptor don't interfere with each other because they are different operations and the kernel keeps them apart. Behind number 4 the kernel keeps an entire data structure that contains separate receive and send buffers, flow-control state, and much more. Only in user space is it represented as a single number, which you can think of as an index into a per-process table kept in the kernel.
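
    To see both points in one run, here is a minimal sketch (Unix only, not a definitive implementation). It connects a client to the listener from within the same process, so no external tool is needed. The helper name listener_and_stream_fds, the loopback address, and the use of port 0 (which lets the OS pick a free port) are assumptions made for the illustration; they are not part of the question's code. The actual descriptor numbers depend on what the process already has open, so the sketch only checks that they differ.

```rust
use std::io::{self, Read, Write};
use std::net::{TcpListener, TcpStream};
use std::os::fd::AsRawFd;

/// Hypothetical helper: returns (listener fd, accepted stream fd, bytes echoed back).
fn listener_and_stream_fds() -> io::Result<(i32, i32, Vec<u8>)> {
    // The listening socket: socket() + bind() + listen(). It never carries payload data.
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;

    // A client in the same process; the kernel queues the connection on the listener.
    let mut client = TcpStream::connect(addr)?;

    // accept() hands back a NEW socket, and therefore a new descriptor, for this peer.
    let (mut accepted, _peer) = listener.accept()?;

    // Reading and writing both go through the one accepted descriptor.
    client.write_all(b"ping")?;
    let mut buf = [0u8; 4];
    accepted.read_exact(&mut buf)?; // incoming data: read from the accepted fd
    accepted.write_all(&buf)?; // outgoing data: write to the same fd

    let mut echoed = [0u8; 4];
    client.read_exact(&mut echoed)?;

    Ok((listener.as_raw_fd(), accepted.as_raw_fd(), echoed.to_vec()))
}

fn main() -> io::Result<()> {
    let (listener_fd, stream_fd, echoed) = listener_and_stream_fds()?;
    println!("listener fd: {listener_fd}, accepted stream fd: {stream_fd}");
    println!("echoed back: {}", String::from_utf8_lossy(&echoed));
    Ok(())
}
```

    The listener descriptor is only ever passed to accept(); all reading and writing for a given client happens on the descriptor that accept() returned.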