
How to see TCP, IP headers in my HTTP proxy?


I have a forking HTTP proxy implemented on my Ubuntu 14.04 x86_64 with the following scheme (I'm reporting the essential code and pseudocode just to show the concept):

  1. socketClient = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
  2. bind(socketClient,(struct sockaddr*)&addr, sizeof(addr));
  3. listen(socketClient, 50);
  4. newSocket = accept(socketClient, (struct sockaddr*)&cliAddr, &cliLen); /* cliLen: a socklen_t set to sizeof(cliAddr) */
  5. get request from client, parse it to resolve the requested hostname to an IP address;
  6. fork(), open connection to remote server and handle the request;
  7. child process: if it is a GET request, send original request to server and while server is sending data, send data from server to client;
  8. child process: else if it is a CONNECT request, send string 200 ok to client and poll both client socket descriptor and server socket descriptor with select(); if I read data from server socket, send this data to client; else if I read data from client socket, send this data to server.
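
As a rough sketch of step 8 (the names `relay_once` and `forward` are made up for this post, the 4096-byte buffer is arbitrary, and error handling is minimal), one iteration of the tunnel loop looks something like this:

```c
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Read one chunk from `from` and write all of it to `to`.
 * Returns the number of bytes forwarded, 0 on EOF, -1 on error. */
static ssize_t forward(int from, int to)
{
    char buf[4096];
    ssize_t n = recv(from, buf, sizeof(buf), 0);
    ssize_t off = 0;

    if (n <= 0)
        return n;
    while (off < n) {
        /* send() may accept fewer bytes than requested, so loop. */
        ssize_t w = send(to, buf + off, (size_t)(n - off), 0);
        if (w < 0)
            return -1;
        off += w;
    }
    return n;
}

/* Block until the client or the server socket is readable, then forward
 * one chunk to the other side. Return value as for forward(). */
ssize_t relay_once(int client_fd, int server_fd)
{
    fd_set rfds;
    int maxfd = client_fd > server_fd ? client_fd : server_fd;

    FD_ZERO(&rfds);
    FD_SET(client_fd, &rfds);
    FD_SET(server_fd, &rfds);
    if (select(maxfd + 1, &rfds, NULL, NULL, NULL) < 0)
        return -1;
    if (FD_ISSET(client_fd, &rfds))
        return forward(client_fd, server_fd);
    return forward(server_fd, client_fd);
}
```

The child calls `relay_once(newSocket, serverSocket)` in a loop until it returns 0 or -1. Note that the value it returns is a byte count, which says nothing about how many TCP segments carried those bytes.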

The good thing is that this proxy works, the bad thing is that now I must collect statistics; this is bad because I'm working on a level where I can't get the data I'm interested in. I don't care about the payload, I just need to check in IP and TCP headers the flags I care about.

For example, I'm interested in:

  • connection tracking;
  • number of packets sent and received.

As for the first, I would check in the TCP header the SYN flag, SYN/ACK and then a last ACK; as for the second, I would just do +1 to a counter of mine every time a char buffer[1500] is filled with data when I send() or recv() a full packet.

I realized that this is not correct: SOCK_STREAM doesn't have the concept of a packet, it is just a continuous stream of bytes! The char buffer[1500] I use at points 7 and 8 carries no useful statistic: I could set its capacity to 4096 bytes and still couldn't keep track of the TCP packets sent or received, because TCP has segments, not packets, and the stream API doesn't expose their boundaries.

I couldn't parse the char buffer[] looking for the SYN flag in the TCP header either, because the IP and TCP headers are stripped before the data reaches my buffer (because of the level I'm working on, specified with SOCK_STREAM and IPPROTO_TCP) and, if I understood correctly, the char buffer[] contains only the payload, which is useless to me.

So, if I'm working at too high a level, I should go lower: I once saw a simple raw socket sniffer where an unsigned char buffer[65535] was cast to struct ethhdr, iphdr and tcphdr, and it could see all the flags of all the headers, all the stats I'm interested in!

After the joy, the disappointment: since raw sockets work on a low level they don't have some concepts vital to my proxy; raw sockets can't bind, listen and accept; my proxy is listening on a fixed port, but raw sockets don't know what a port is, it belongs to the TCP level and they bind to a specified interface with setsockopt.

So, if I used socket(PF_PACKET, SOCK_RAW, htons(ETH_P_ALL)) I should be able to parse the buffer where I recv() and send() at points 7 and 8, but I would have to use recvfrom() and sendto()... All this sounds quite messy, and it involves a substantial refactoring of my code.

How can I keep the structure of my proxy intact (bind, listen, accept on a fixed port and interface) and still get visibility into the IP and TCP headers?


Solution

  • My suggestion is to open a raw socket in, for example, another thread of your application. Sniff all traffic and filter out the relevant packets by addresses and port numbers. Basically you want to implement your own packet sniffer:

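    #include <stdio.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <netinet/in.h>

    /* Opening a raw socket needs root or CAP_NET_RAW. */
    int sniff(void)
    {
        int sockfd;
        ssize_t len;
        socklen_t saddr_size;
        struct sockaddr saddr;
        unsigned char buffer[65536];

        /* Raw IPv4 socket: each recvfrom() returns one TCP packet,
         * starting at the IP header. On Linux this kind of socket
         * delivers received packets only; to see outgoing traffic as
         * well you need an AF_PACKET socket. */
        sockfd = socket(AF_INET, SOCK_RAW, IPPROTO_TCP);
        if (sockfd < 0) {
            perror("socket");
            return -1;
        }
        while (1) {
            saddr_size = sizeof(saddr);
            len = recvfrom(sockfd, buffer, sizeof(buffer), 0, &saddr, &saddr_size);
            if (len < 0) {
                perror("recvfrom");
                close(sockfd);
                return -1;
            }

            // ... do the things you want to do with the packet received here ...
        }
    }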
    

    You can also bind that raw socket to a specific interface if you know which interface is going to be used for the proxy's traffic. For example, to bind to "eth0":

    setsockopt(sockfd, SOL_SOCKET, SO_BINDTODEVICE, "eth0", 4);
    

    Use getpeername() and getsockname() function calls to find the local and remote addresses and port numbers of your TCP connections. You'll want to filter the packets by those.
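
To make the filtering step concrete, here is a minimal sketch, assuming IPv4 only and a buffer that starts at the IP header (as an AF_INET raw socket delivers it). The types and functions `endpoints`, `tcp_info`, `endpoints_from_socket`, `parse_ipv4_tcp` and `belongs_to`, and the `TCPF_*` constants, are names invented for this example; only `getsockname()`, `getpeername()` and the header layouts are standard.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Illustrative flag masks for byte 13 of the TCP header. */
#define TCPF_FIN 0x01
#define TCPF_SYN 0x02
#define TCPF_RST 0x04
#define TCPF_ACK 0x10

struct endpoints {                      /* one proxied TCP connection */
    uint32_t local_addr, remote_addr;   /* network byte order */
    uint16_t local_port, remote_port;   /* host byte order */
};

struct tcp_info {                       /* fields pulled from a sniffed packet */
    uint32_t src_addr, dst_addr;        /* network byte order */
    uint16_t src_port, dst_port;        /* host byte order */
    uint8_t  flags;                     /* raw TCP flag byte */
};

/* Fill `ep` from a connected TCP socket. Returns 0 on success, -1 on error. */
int endpoints_from_socket(int fd, struct endpoints *ep)
{
    struct sockaddr_in local, remote;
    socklen_t len = sizeof(local);

    if (getsockname(fd, (struct sockaddr *)&local, &len) < 0)
        return -1;
    len = sizeof(remote);
    if (getpeername(fd, (struct sockaddr *)&remote, &len) < 0)
        return -1;
    if (local.sin_family != AF_INET || remote.sin_family != AF_INET)
        return -1;                      /* this sketch handles IPv4 only */
    ep->local_addr  = local.sin_addr.s_addr;
    ep->remote_addr = remote.sin_addr.s_addr;
    ep->local_port  = ntohs(local.sin_port);
    ep->remote_port = ntohs(remote.sin_port);
    return 0;
}

/* Parse an IPv4 packet carrying TCP. Returns 0 on success, -1 otherwise. */
int parse_ipv4_tcp(const unsigned char *buf, size_t len, struct tcp_info *out)
{
    size_t ihl;
    const unsigned char *tcp;

    if (len < 20 || (buf[0] >> 4) != 4)
        return -1;                      /* too short or not IPv4 */
    ihl = (size_t)(buf[0] & 0x0F) * 4;  /* IP header length in bytes */
    if (ihl < 20 || len < ihl + 20 || buf[9] != IPPROTO_TCP)
        return -1;
    memcpy(&out->src_addr, buf + 12, 4);
    memcpy(&out->dst_addr, buf + 16, 4);
    tcp = buf + ihl;
    out->src_port = (uint16_t)((tcp[0] << 8) | tcp[1]);
    out->dst_port = (uint16_t)((tcp[2] << 8) | tcp[3]);
    out->flags = tcp[13];
    return 0;
}

/* Nonzero if the packet travels in either direction of the connection. */
int belongs_to(const struct endpoints *ep, const struct tcp_info *p)
{
    int inbound  = p->src_addr == ep->remote_addr && p->src_port == ep->remote_port &&
                   p->dst_addr == ep->local_addr  && p->dst_port == ep->local_port;
    int outbound = p->src_addr == ep->local_addr  && p->src_port == ep->local_port &&
                   p->dst_addr == ep->remote_addr && p->dst_port == ep->remote_port;
    return inbound || outbound;
}
```

In the sniffer loop you would call `parse_ipv4_tcp(buffer, len, &info)`, skip packets for which `belongs_to()` is zero, and then bump your counters. For example, `(info.flags & TCPF_SYN) && !(info.flags & TCPF_ACK)` marks the first step of a handshake. Parsing byte by byte avoids casting the buffer to header structs, so alignment and bit-field layout are not a concern.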