Tags: c, network-programming, tcp, linux-kernel, kernel-module

skb checksum computing function probably causes system hang


After asking about the same problem on the Linux forum and some bug testing I have the following information that seems relevant to my problem:

I am building an application that diverts incoming packets from a kernel network hook to a proxy in userspace, which reads the data from a TCP socket and then sends it on to its original destination. When a packet enters, I change the skb destination addresses to those of my proxy TCP server, and when it leaves I change the source addresses, so that the communication goes through transparently.

I have encountered the following problem:

When large amounts of data enter, they reach the proxy with no problem.

However, when sending the data to its original destination, if I send a large enough amount of data the system hangs. Caveman debugging showed that the skb is only nonlinear when it leaves the proxy, and that without calling skb_linearize the checksum is not computed correctly.

I do not allocate any data myself in the kernel when the data exits, and I don't seem to have memory errors in my own code, so I have concluded that the problem is most likely with my usage of the skb_linearize function, or with how I compute the checksum in general:

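/* Recompute the TCP and IPv4 checksums after the addresses were rewritten.
 * Returns 0 on success, or -ENOMEM if the skb could not be linearized. */
int fixChecksum(struct sk_buff *skb)
{
    struct iphdr *ip_header;
    struct tcphdr *tcp_header;
    int tcplen;

    /* skb_linearize() is a no-op on linear skbs and can fail on allocation. */
    if (skb_linearize(skb))
        return -ENOMEM;

    /* Fetch the header pointers only after linearizing: the data may have moved. */
    ip_header = ip_hdr(skb);
    tcp_header = (struct tcphdr *)(skb_network_header(skb) + ip_hdrlen(skb));
    tcplen = skb->len - ip_hdrlen(skb);

    tcp_header->check = 0;
    tcp_header->check = tcp_v4_check(tcplen, ip_header->saddr, ip_header->daddr,
                                     csum_partial((char *)tcp_header, tcplen, 0));
    skb->ip_summed = CHECKSUM_NONE; /* stop offloading */

    ip_header->check = 0;
    ip_header->check = ip_fast_csum((u8 *)ip_header, ip_header->ihl);
    return 0;
}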
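
For reference, here is a minimal userspace sketch of the arithmetic that the checksum calls above are expected to perform: a ones'-complement sum over the IPv4 header, and over the TCP pseudo-header plus the whole segment. It is only an illustration for checking values by hand. The function names (csum_add_bytes, csum_finish, ipv4_header_checksum, tcp_ipv4_checksum) are made up for this sketch and are not kernel APIs, and it assumes the checksum field is zeroed before the sum is taken.

```c
#include <stddef.h>
#include <stdint.h>

/* Add len bytes to a running ones'-complement sum, reading the data as
 * big-endian 16-bit words. An odd trailing byte is padded with a zero byte. */
static uint32_t csum_add_bytes(uint32_t sum, const uint8_t *data, size_t len)
{
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
    if (len & 1)
        sum += (uint32_t)data[len - 1] << 8;
    return sum;
}

/* Fold the carries back into the low 16 bits, then complement. */
static uint16_t csum_finish(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Checksum of an IPv4 header whose checksum field (bytes 10-11) is zero. */
static uint16_t ipv4_header_checksum(const uint8_t *hdr, size_t hdr_len)
{
    return csum_finish(csum_add_bytes(0, hdr, hdr_len));
}

/* TCP-over-IPv4 checksum: pseudo-header (source address, destination
 * address, zero byte, protocol 6, TCP length) followed by the segment.
 * The addresses are 4 bytes each in network order, and the checksum field
 * of the segment (bytes 16-17) must be zero when computing a new value. */
static uint16_t tcp_ipv4_checksum(const uint8_t saddr[4], const uint8_t daddr[4],
                                  const uint8_t *seg, size_t seg_len)
{
    uint32_t sum = 0;

    sum = csum_add_bytes(sum, saddr, 4);
    sum = csum_add_bytes(sum, daddr, 4);
    sum += 6;                  /* zero byte + protocol number of TCP */
    sum += (uint32_t)seg_len;  /* TCP header plus payload, in bytes */
    sum = csum_add_bytes(sum, seg, seg_len);
    return csum_finish(sum);
}
```

Note that the sum covers the entire TCP segment, which is why the kernel version needs every payload byte reachable through one pointer, and therefore a linear skb. A handy self-check is that recomputing the checksum over a segment that already contains its correct checksum gives 0.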

My suspicion is that the data I transfer somehow stays in the kernel, and that after a large enough amount the kernel runs out of memory and the system hangs. However, I don't see what may be wrong here. I also tried changing skb_linearize to skb_linearize_cow, which did not help.

Is it possible that the skbs I process on the LOCAL_OUT hook don't get freed after I process them?

My kernel version is 3.2


Solution

  • So after a bit more debugging, and after checking kernel memory usage while my program runs, it turned out that memory usage didn't grow with each packet. The lockups stopped once I removed the spinlocks from one of my functions, which I had added because I wanted to keep my module functions thread safe between system calls. Apparently I didn't know what I was doing, and the heavy load probably caused a deadlock or something similar, which is strange because I only have one lock. Thanks if anyone tried to help, and sorry for wasting your time.