
iov and msg_control in sendmsg and recvmsg


What is the difference between iov.iov_base and msg.msg_control? I'm looking at some code examples (the open source ping from iputils). When sending data using sendmsg, the packet is set in iov.iov_base. When reading data using recvmsg, the packet is read from msg->msg_control directly.

What is the relationship between struct iovec and struct msghdr? Is there a difference between reading and sending data?

Sorry for the silly question. I haven't found an answer so far and I'm confused. Thanks!


Solution

  • Ancillary data or control messages (.msg_controllen bytes at .msg_control) are data provided or verified by the kernel, whereas the normal payload (in the iovecs) is just data received from the other endpoint, unverified and unchecked by the kernel (except for the checksum, if the protocol has one).
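
    As a minimal sketch of the payload side only: the helper below (gather_roundtrip is a hypothetical name, and the Unix domain datagram socketpair is an assumption chosen so the example needs no network) sends two separate buffers as one datagram through two iovecs, leaves msg_control unset, and receives the result into a single buffer.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: sends "ping" and "-pong" as two iovecs in one
   datagram, then receives that datagram into buf. Returns the number of
   payload bytes received, or -1 on error. */
ssize_t gather_roundtrip(char *buf, size_t buflen)
{
    int sv[2];
    char part1[] = "ping";
    char part2[] = "-pong";
    struct iovec out[2] = {
        { .iov_base = part1, .iov_len = strlen(part1) },
        { .iov_base = part2, .iov_len = strlen(part2) },
    };
    struct iovec in = { .iov_base = buf, .iov_len = buflen };
    struct msghdr smsg, rmsg;
    ssize_t n = -1;

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) == -1)
        return -1;

    /* The payload is described only by msg_iov and msg_iovlen;
       msg_control stays NULL because no ancillary data is sent. */
    memset(&smsg, 0, sizeof smsg);
    smsg.msg_iov = out;
    smsg.msg_iovlen = 2;

    memset(&rmsg, 0, sizeof rmsg);
    rmsg.msg_iov = &in;
    rmsg.msg_iovlen = 1;

    /* The kernel gathers both pieces into one 9-byte datagram. */
    if (sendmsg(sv[0], &smsg, 0) == 9)
        n = recvmsg(sv[1], &rmsg, 0);

    close(sv[0]);
    close(sv[1]);
    return n;
}
```

    The receiver sees one contiguous payload; how the sender split it across iovecs is not visible on the other end.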

    For IP sockets (see man 7 ip), there are several socket options that cause the kernel to provide ancillary data on received messages. For example:

    • IP_RECVORIGDSTADDR socket option tells the kernel to provide an IP_ORIGDSTADDR type ancillary message (with a struct sockaddr_in as data), identifying the original destination address of the datagram received

    • IP_RECVOPTS socket option tells the kernel to provide an IP_OPTIONS type ancillary message containing all IP option headers (up to 40 bytes for IPv4) for incoming datagrams
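
    To make the first option concrete, here is a Linux-only sketch, assuming a UDP socket bound to the loopback address. The helper name recv_with_origdst, the "hi" payload, and the one-second receive timeout are all assumptions made for the example. It enables IP_RECVORIGDSTADDR, sends itself a datagram, and copies the struct sockaddr_in out of the ancillary data while the payload lands in the iovec.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: on success stores the kernel-provided destination
   address in *dst and the bound port (network byte order) in *port, and
   returns the payload length. Returns -1 on any failure. */
ssize_t recv_with_origdst(struct sockaddr_in *dst, in_port_t *port)
{
    ssize_t result = -1;
    int on = 1;
    struct timeval timeout = { .tv_sec = 1, .tv_usec = 0 };
    struct sockaddr_in addr;
    socklen_t addrlen = sizeof addr;
    int rx = socket(AF_INET, SOCK_DGRAM, 0);
    int tx = socket(AF_INET, SOCK_DGRAM, 0);

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK); /* port 0: kernel picks */

    if (rx != -1 && tx != -1 &&
        setsockopt(rx, IPPROTO_IP, IP_RECVORIGDSTADDR, &on, sizeof on) == 0 &&
        setsockopt(rx, SOL_SOCKET, SO_RCVTIMEO, &timeout, sizeof timeout) == 0 &&
        bind(rx, (struct sockaddr *)&addr, sizeof addr) == 0 &&
        getsockname(rx, (struct sockaddr *)&addr, &addrlen) == 0 &&
        sendto(tx, "hi", 2, 0, (struct sockaddr *)&addr, sizeof addr) == 2) {

        char payload[16];
        union {  /* the union keeps the control buffer suitably aligned */
            char buf[CMSG_SPACE(sizeof (struct sockaddr_in))];
            struct cmsghdr align;
        } control;
        struct iovec iov = { .iov_base = payload, .iov_len = sizeof payload };
        struct msghdr msg;
        ssize_t n;

        memset(&msg, 0, sizeof msg);
        msg.msg_iov = &iov;                      /* payload goes here */
        msg.msg_iovlen = 1;
        msg.msg_control = control.buf;           /* ancillary data goes here */
        msg.msg_controllen = sizeof control.buf;

        n = recvmsg(rx, &msg, 0);
        if (n != -1) {
            for (struct cmsghdr *c = CMSG_FIRSTHDR(&msg); c != NULL;
                 c = CMSG_NXTHDR(&msg, c)) {
                if (c->cmsg_level == IPPROTO_IP &&
                    c->cmsg_type == IP_ORIGDSTADDR) {
                    memcpy(dst, CMSG_DATA(c), sizeof *dst);
                    *port = addr.sin_port;
                    result = n;
                }
            }
        }
    }
    if (rx != -1) close(rx);
    if (tx != -1) close(tx);
    return result;
}
```

    On an unredirected loopback datagram the reported address should simply match the address and port the receiving socket is bound to.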

    Ping and traceroute use ICMP messages over IP; see man 7 icmp (and man 7 raw) for details.

    Because most ICMP responses do not contain useful data filled in by the sender, the iovecs don't usually contain anything interesting. Instead, the interesting data is in the IP message headers and options.

    For example, an ICMP Echo Reply packet has a header of just 8 bytes (64 bits): an 8-bit type (0), an 8-bit code (0), a 16-bit checksum, a 16-bit id, and a 16-bit sequence number; any data after it is only what the request carried. To get the IP headers with the interesting fields, you need the kernel to provide them as ancillary data control messages.
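
    The five fields listed above can be sketched as a C struct. The names icmp_echo_hdr and inet_checksum are hypothetical (real code would normally use struct icmphdr from <netinet/ip_icmp.h>), and the checksum routine is the standard Internet checksum (RFC 1071), shown here only to illustrate the one field the kernel does verify.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout of the 8-byte ICMP echo header. */
struct icmp_echo_hdr {
    uint8_t  type;      /* 0 = Echo Reply, 8 = Echo Request */
    uint8_t  code;      /* always 0 for echo messages */
    uint16_t checksum;  /* Internet checksum over the whole ICMP message */
    uint16_t id;        /* identifier, network byte order on the wire */
    uint16_t seq;       /* sequence number, network byte order on the wire */
};

_Static_assert(sizeof (struct icmp_echo_hdr) == 8,
               "ICMP echo header must be exactly 8 bytes");

/* Internet checksum over len bytes. Words are read in memory order, so the
   result can be stored into the checksum field as is on any endianness. */
uint16_t inet_checksum(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint32_t sum = 0;

    while (len > 1) {
        uint16_t word;
        memcpy(&word, p, 2);
        sum += word;
        p += 2;
        len -= 2;
    }
    if (len == 1) {             /* odd trailing byte, padded with zero */
        uint16_t word = 0;
        memcpy(&word, p, 1);
        sum += word;
    }
    while (sum >> 16)           /* fold carries back into the low 16 bits */
        sum = (sum & 0xFFFF) + (sum >> 16);
    return (uint16_t)~sum;
}
```

    A message is filled in with the checksum field set to zero, the result of inet_checksum() is stored in that field, and recomputing the checksum over the finished message then yields zero.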


    The background:

    As described in the sendmsg() and related man pages, we have

    ssize_t sendmsg(int sockfd, const struct msghdr *msg, int flags);
    
    struct msghdr {
        void         *msg_name;       /* Optional address */
        socklen_t     msg_namelen;    /* Size of address */
        struct iovec *msg_iov;        /* Scatter/gather array */
        size_t        msg_iovlen;     /* # elements in msg_iov */
        void         *msg_control;    /* Ancillary data */
        size_t        msg_controllen; /* Ancillary data buffer len */
        int           msg_flags;      /* Flags (unused) */
    };
    
    struct iovec {
        void         *iov_base;       /* Starting address */
        size_t        iov_len;        /* Number of bytes to transfer */
    };
    

    with man 3 cmsg describing how to construct and access such ancillary data,

    struct cmsghdr {
        size_t        cmsg_len;    /* Data byte count, including header
                                      (type is socklen_t in POSIX) */
        int           cmsg_level;  /* Originating protocol */
        int           cmsg_type;   /* Protocol-specific type */
        unsigned char cmsg_data[]; /* Data itself */
    };
    
    struct cmsghdr *CMSG_FIRSTHDR(struct msghdr *msgh);
    struct cmsghdr *CMSG_NXTHDR(struct msghdr *msgh, struct cmsghdr *cmsg);
    size_t          CMSG_ALIGN(size_t length);
    size_t          CMSG_SPACE(size_t length);
    size_t          CMSG_LEN(size_t length);
    unsigned char  *CMSG_DATA(struct cmsghdr *cmsg);
    

    These ancillary data messages are always sufficiently aligned for the current architecture (so that the data items can be accessed directly). Therefore, to construct a proper ancillary message (for example SCM_CREDENTIALS to pass user, group, and process ID information over a Unix domain socket, or SCM_RIGHTS to pass file descriptors), these macros have to be used. The man 3 cmsg man page contains example code for these.
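
    As a rough sketch in the spirit of the man 3 cmsg example, the two helpers below pass a single file descriptor over a Unix domain socket with SCM_RIGHTS. The names send_fd and recv_fd and the one-byte dummy payload are assumptions made for the illustration.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: sends descriptor fd over the Unix domain socket
   sock. Returns 0 on success, -1 on error. */
int send_fd(int sock, int fd)
{
    char byte = 'F';    /* one byte of ordinary payload in the iovec */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {             /* the union keeps the buffer suitably aligned */
        char buf[CMSG_SPACE(sizeof (int))];
        struct cmsghdr align;
    } control;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&control, 0, sizeof control);
    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = control.buf;
    msg.msg_controllen = sizeof control.buf;

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof (int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof fd);

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receives one descriptor from sock. Returns the new
   descriptor, or -1 if none arrived. */
int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof (int))];
        struct cmsghdr align;
    } control;
    struct msghdr msg;
    int fd = -1;

    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = control.buf;
    msg.msg_controllen = sizeof control.buf;

    if (recvmsg(sock, &msg, 0) <= 0)
        return -1;

    for (struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg); cmsg != NULL;
         cmsg = CMSG_NXTHDR(&msg, cmsg)) {
        if (cmsg->cmsg_level == SOL_SOCKET &&
            cmsg->cmsg_type == SCM_RIGHTS &&
            cmsg->cmsg_len == CMSG_LEN(sizeof (int)))
            memcpy(&fd, CMSG_DATA(cmsg), sizeof fd);
    }
    return fd;
}
```

    The descriptor number seen by the receiver generally differs from the sender's, but both refer to the same open file description.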

    Suffice it to say that, to loop over each ancillary data part in a given message (struct msghdr msg), you use something that boils down to

    char *const  end = (char *)msg.msg_control + msg.msg_controllen;

    /* Each message starts at an aligned offset, so step by the aligned
       length. Real code should use CMSG_FIRSTHDR() and CMSG_NXTHDR(), which
       also guard against truncated or malformed lengths. */
    for (char *ptr = (char *)msg.msg_control;  ptr < end;
               ptr += CMSG_ALIGN(((struct cmsghdr *)ptr)->cmsg_len)) {
        struct cmsghdr *const cmsg = (struct cmsghdr *)ptr;

        /* level is cmsg->cmsg_level and type is cmsg->cmsg_type, and
           CMSG_DATA(cmsg) is sufficiently aligned for the level and type,
           so you can use ((datatype *)CMSG_DATA(cmsg)) to obtain a pointer
           to the type corresponding to this level and type ancillary payload.
           The exact size of the payload is
               (cmsg->cmsg_len - CMSG_LEN(0))
           so e.g. an SCM_RIGHTS ancillary message, with
               cmsg->cmsg_level == SOL_SOCKET && cmsg->cmsg_type == SCM_RIGHTS
           has exactly
               (cmsg->cmsg_len - CMSG_LEN(0)) / sizeof (int)
           new file descriptors as a payload.
        */
    }
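
    The same loop written with the man 3 cmsg macros might look like the sketch below. The function name count_passed_fds is hypothetical; it adds up the file descriptors carried by all SCM_RIGHTS messages in an already received struct msghdr.

```c
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical helper: returns the total number of file descriptors
   carried by the SCM_RIGHTS ancillary messages in *msg. */
size_t count_passed_fds(struct msghdr *msg)
{
    size_t total = 0;

    for (struct cmsghdr *cmsg = CMSG_FIRSTHDR(msg); cmsg != NULL;
         cmsg = CMSG_NXTHDR(msg, cmsg)) {
        if (cmsg->cmsg_level == SOL_SOCKET &&
            cmsg->cmsg_type == SCM_RIGHTS) {
            /* Payload size is the message length minus the aligned
               header, which is what CMSG_LEN(0) evaluates to. */
            total += (cmsg->cmsg_len - CMSG_LEN(0)) / sizeof (int);
        }
    }
    return total;
}
```

    Letting CMSG_NXTHDR() do the stepping keeps the alignment and bounds checks in one place, and it returns NULL when the remaining control buffer cannot hold another header.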