After asking about the same problem on the Linux forum and doing some debugging, I have the following information that seems relevant to my problem:
I am building an application that diverts incoming packets from a kernel network hook to a proxy in userspace, which reads the data from a TCP socket and then sends it on to its original destination. When a packet enters, I change the skb destination addresses to those of my proxy TCP server, and when it leaves I change the source addresses so that the communication goes through transparently.
I have encountered the following problem:
When large amounts of data enter, they reach the proxy with no problem.
However, when sending the data to its original destination, if I send a large enough amount of data the system hangs. Caveman debugging showed that the skb is only non-linear when it leaves the proxy, and without calling skb_linearize the checksum is not computed successfully.
I do not allocate any data myself in the kernel when the data exits, and I don't seem to have memory errors in my own code, so I have concluded that with high probability the problem is with my usage of the skb_linearize function, or with how I compute the checksum in general:
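/* Recompute the TCP and IP checksums after the addresses were rewritten.
 * Returns 0 on success, or -ENOMEM if the skb could not be linearized. */
int fixChecksum(struct sk_buff *skb)
{
    struct iphdr *ip_header;
    struct tcphdr *tcp_header;
    int tcplen;

    /* csum_partial() below only covers the linear data area. */
    if (skb_is_nonlinear(skb) && skb_linearize(skb) != 0)
        return -ENOMEM; /* the caller should drop the packet */

    /* Fetch header pointers only after linearizing, which may move the data. */
    ip_header = ip_hdr(skb);
    tcp_header = (struct tcphdr *)(skb_network_header(skb) + ip_hdrlen(skb));
    tcplen = skb->len - (ip_header->ihl << 2);

    tcp_header->check = 0;
    tcp_header->check = tcp_v4_check(tcplen, ip_header->saddr, ip_header->daddr,
                                     csum_partial((char *)tcp_header, tcplen, 0));
    skb->ip_summed = CHECKSUM_NONE; /* stop offloading */

    ip_header->check = 0;
    ip_header->check = ip_fast_csum((u8 *)ip_header, ip_header->ihl);
    return 0;
}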
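For cross-checking what the kernel helpers produce, the same arithmetic can be reproduced in userspace on bytes copied out of a packet capture. This is only a sketch under assumptions: the function names csum_add, csum_fold and tcp_checksum are made up for illustration and are not kernel APIs, addresses are passed as four bytes in network order, and only the final chunk passed to csum_add may have an odd length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Add len bytes, read as big-endian 16-bit words, to a running sum.
 * Hypothetical helper; an odd trailing byte is padded with zero. */
static uint32_t csum_add(uint32_t sum, const uint8_t *data, size_t len)
{
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
    if (len & 1)
        sum += (uint32_t)data[len - 1] << 8;
    return sum;
}

/* Fold the carries back into 16 bits and take the one's complement. */
static uint16_t csum_fold(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/* TCP checksum over the IPv4 pseudo-header plus the whole TCP segment
 * (header and payload). The checksum field must be zero on input. */
static uint16_t tcp_checksum(const uint8_t saddr[4], const uint8_t daddr[4],
                             const uint8_t *seg, size_t len)
{
    uint8_t pseudo[12];

    assert(len >= 20 && len <= 0xffff);
    memcpy(pseudo, saddr, 4);
    memcpy(pseudo + 4, daddr, 4);
    pseudo[8] = 0;                      /* reserved */
    pseudo[9] = 6;                      /* protocol number for TCP */
    pseudo[10] = (uint8_t)(len >> 8);   /* TCP length, big-endian */
    pseudo[11] = (uint8_t)(len & 0xff);
    return csum_fold(csum_add(csum_add(0, pseudo, sizeof(pseudo)), seg, len));
}
```

The result is the value of the checksum field as a big-endian word, so it is stored as seg[16] = c >> 8 and seg[17] = c & 0xff. Running tcp_checksum again over a segment with the checksum filled in should give 0, which is a quick way to validate a captured packet.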
My suspicion is that the data I transfer somehow stays in the kernel, and after a large enough amount the kernel runs out of memory and the system hangs.
However I don't see what may be wrong here.
I also tried changing skb_linearize to skb_linearize_cow, which did not help.
Is it possible that the skbs I process on the LOCAL_OUT hook don't get freed after I process them?
My kernel version is 3.2.
After a bit more debugging, and after checking kernel memory usage while my program ran, it turned out that memory did not grow with each packet. The lockups stopped once I removed the spinlocks from one of my functions; I had added them because I wanted to keep my module functions thread safe between system calls. Apparently I didn't know what I was doing, and the heavy load probably caused a deadlock, which is strange because I only have one lock. Thanks to anyone who tried to help, and sorry for wasting your time.
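One plausible explanation for a deadlock with a single lock (an assumption, not confirmed against the module code): netfilter hooks can run in softirq context, so if a system call holds the lock via plain spin_lock() and a packet is processed on the same CPU before the unlock, the hook spins on a lock whose owner cannot run again. The usual remedy in that situation is to take the lock with spin_lock_bh() in the process-context path. The toy userspace model below only illustrates the ordering; module_lock, hook_runs and syscall_path are hypothetical names, and a failed try-lock stands in for a spin that would never end.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the module's single spinlock. */
static atomic_flag module_lock = ATOMIC_FLAG_INIT;

static bool model_trylock(void)
{
    return !atomic_flag_test_and_set(&module_lock);
}

static void model_unlock(void)
{
    atomic_flag_clear(&module_lock);
}

/* Models the packet hook. Returns false at the point where a real
 * spin_lock() would spin forever because the owner cannot run. */
static bool hook_runs(void)
{
    if (!model_trylock())
        return false;
    /* ... shared state would be touched here ... */
    model_unlock();
    return true;
}

/* Models the system call path. bh_disabled == true stands for
 * spin_lock_bh(): the softirq is deferred until after the unlock. */
static bool syscall_path(bool packet_arrives, bool bh_disabled)
{
    bool hook_ok = true;
    bool got = model_trylock();

    assert(got);
    (void)got;
    if (packet_arrives && !bh_disabled)
        hook_ok = hook_runs();  /* softirq interrupts the lock holder */
    model_unlock();
    if (packet_arrives && bh_disabled)
        hook_ok = hook_runs();  /* deferred softirq runs after unlock */
    return hook_ok;
}
```

In this model syscall_path(true, false) reports the hang, while syscall_path(true, true) completes, which matches the observation that the problem only shows up under heavy traffic, when a packet is likely to arrive while the lock is held.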