Tags: linux, assembly, linux-kernel, kernel-module, system-calls

Intercepting syscalls (where are args passed)


I'm writing a kernel module that intercepts syscalls. Intercepting, or rather just replacing the real syscall address with a fake syscall address, is as easy as 1-2-3 in plain C. But I'd like to know how it works at a low level.

(let's pretend I'm on x86)

First of all, I'm doing just a basic test: I'm allocating a small chunk of executable kernel memory and filling it with these opcodes:

0xB8, 0x00, 0x00, 0x00, 0x00,          //mov eax, &real_syscall_function;
0xFF, 0xE0,                            //jmp eax;

Inserting the module and replacing the syscall works just perfect.
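The four zero bytes after 0xB8 in the stub above are a placeholder for the address of the real syscall function. As a minimal sketch of how that placeholder might be filled in, assuming a 32-bit target address: the helper name build_jmp_stub and the constant JMP_STUB_SIZE are made up for illustration, and the code only builds the bytes in an ordinary buffer (it does not allocate executable memory or touch the kernel).

```c
#include <stdint.h>
#include <string.h>

/* 5 bytes for "mov eax, imm32" plus 2 bytes for "jmp eax". */
#define JMP_STUB_SIZE 7

/*
 * Hypothetical helper: copies the 7-byte stub into buf and patches the
 * 32-bit target address into the immediate operand of the mov.
 * x86 stores immediates little-endian, so the least significant byte
 * of the address goes first.
 */
static void build_jmp_stub(uint8_t buf[JMP_STUB_SIZE], uint32_t target)
{
    static const uint8_t stub_template[JMP_STUB_SIZE] = {
        0xB8, 0x00, 0x00, 0x00, 0x00,   /* mov eax, imm32 (placeholder) */
        0xFF, 0xE0                      /* jmp eax */
    };

    memcpy(buf, stub_template, JMP_STUB_SIZE);

    /* Write the bytes one by one so the result does not depend on the
     * endianness of the machine running this helper. */
    buf[1] = (uint8_t)(target);
    buf[2] = (uint8_t)(target >> 8);
    buf[3] = (uint8_t)(target >> 16);
    buf[4] = (uint8_t)(target >> 24);
}
```

Calling build_jmp_stub(buf, addr) leaves the opcode bytes 0xB8 and 0xFF, 0xE0 untouched and only rewrites bytes 1 to 4 of the buffer.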

Now, according to this SO answer, arguments are passed in the registers. I want to check this, so I create an executable chunk of memory and fill it with this code:

0x55,                                  //push ebp;
0x89, 0xE5,                            //mov ebp, esp;
0x83, 0xEC, 0x20,                      //sub esp, 32; 

0xB8, 0x00, 0x00, 0x00, 0x00,          //mov eax, &real_syscall_function;
0xFF, 0xE0,                            //jmp eax;

0x89, 0xEC,                            //mov esp, ebp;
0x5D,                                  //pop ebp;
0xC3                                   //ret;

This should work too, since I'm not touching any of the registers, I'm just playing with the stack. But it doesn't work. That makes me think the arguments are actually passed on the stack. But why? Am I misunderstanding the SO answer I linked to? Aren't the args supposed to be in the registers when a syscall is called?

Extra question: Why does jmp eax work, but call eax doesn't? (This applies to both the first and the second example code.)

Edit: I'm sorry, I left some of the comments in the ASM code incomplete. What I'm jumping to is the address of the real syscall function.

Edit 2: I think it's obvious, but I'll explain it anyway in case somebody doesn't understand what I'm doing. I'm allocating a small executable chunk of memory, filling it with the opcodes shown above, and then making a given syscall (let's say __NR_read) point to the address of that executable chunk of memory.
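For readers who want to see the table swap itself, here is a rough user-space analogy in C, not kernel code. It assumes the syscall table can be modelled as a plain array of function pointers. Every name in it (sim_table, sim_real_read, sim_fake_read, sim_install_hook, sim_remove_hook) is invented for illustration. The only real-world value is the index 3, which is the number of read on 32-bit x86.

```c
#include <stddef.h>

/* Simplified syscall signature for this sketch: three long arguments. */
typedef long (*sim_syscall_fn)(long, long, long);

#define SIM_NR_READ 3               /* __NR_read is 3 on 32-bit x86 */
#define SIM_TABLE_SIZE 8

static sim_syscall_fn sim_table[SIM_TABLE_SIZE]; /* stand-in for the syscall table */
static sim_syscall_fn sim_saved_read;            /* original entry, kept for chaining */
static int sim_hook_hits;                        /* how often the fake handler ran */

/* Stand-in for the real handler: pretends that all bytes were read. */
static long sim_real_read(long fd, long buf, long count)
{
    (void)fd;
    (void)buf;
    return count;
}

/* Fake handler: records the call, then forwards it to the saved original. */
static long sim_fake_read(long fd, long buf, long count)
{
    sim_hook_hits++;
    return sim_saved_read(fd, buf, count);
}

/* Save the current table entry and point it at the fake handler. */
static void sim_install_hook(void)
{
    sim_saved_read = sim_table[SIM_NR_READ];
    sim_table[SIM_NR_READ] = sim_fake_read;
}

/* Put the saved entry back. */
static void sim_remove_hook(void)
{
    sim_table[SIM_NR_READ] = sim_saved_read;
}
```

In this model the fake handler is ordinary C, so the compiler takes care of passing the arguments on to the original. The hand-written stub in the question has to preserve the caller's stack layout itself.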


"works just perfect" == the system keeps running without problems, which means the real syscall is being called from the fake syscall.

"it doesn't work" == the system crashes because the fake syscall isn't calling the real syscall.


Solution

  • Syscall parameters are first passed from userspace in registers to the system_call() function, which is in essence a common syscall dispatcher. However, system_call() then calls the real system call functions, such as sys_read(), in the usual manner, passing the parameters on the stack. The second stub moves esp (push ebp; sub esp, 32) before jumping, so the real function no longer finds its return address and arguments at the offsets it expects, and that leads to the crash. Also, see this SO answer: https://stackoverflow.com/a/10459713 and a very detailed explanation on Quora: http://www.quora.com/Linux-Kernel/What-does-asmlinkage-mean-in-the-definition-of-system-calls#step=6 (registration required).
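To make the stack-offset argument concrete, here is a small user-space simulation in C. It is a sketch, not kernel code. It assumes the 32-bit layout described above: on entry to the real syscall function the return address sits at [esp] and argument n sits at [esp + 4 + 4*n]. All names (sim_stack, sim_dispatch, sim_read_arg and the three stub_* functions) are hypothetical, and the marker 0xDEADBEEF stands in for whatever stale data lies below the stack pointer. The same model also suggests why call eax fails in the question: call pushes one extra return address, so every argument appears shifted by one slot.

```c
#include <stdint.h>

#define STACK_WORDS   64
#define SIM_GARBAGE   0xDEADBEEFu  /* marker for stale stack contents */
#define SIM_SAVED_EBP 0x11111111u  /* made-up value pushed by "push ebp" */
#define SIM_STUB_RET  0x22222222u  /* made-up return address pushed by "call eax" */

/* A toy 32-bit stack: esp is an index into mem, and the stack grows down. */
struct sim_stack {
    uint32_t mem[STACK_WORDS];
    int esp;
};

static void sim_push(struct sim_stack *s, uint32_t value)
{
    s->mem[--s->esp] = value;
}

/* Models the dispatcher: arguments on the stack, then the return address
 * that its own call instruction pushes. */
static void sim_dispatch(struct sim_stack *s, uint32_t a0, uint32_t a1,
                         uint32_t a2, uint32_t ret_addr)
{
    for (int i = 0; i < STACK_WORDS; i++)
        s->mem[i] = SIM_GARBAGE;
    s->esp = STACK_WORDS;
    sim_push(s, a2);
    sim_push(s, a1);
    sim_push(s, a0);
    sim_push(s, ret_addr);
}

/* What the real syscall function reads as argument n: [esp + 4 + 4*n]. */
static uint32_t sim_read_arg(const struct sim_stack *s, int n)
{
    return s->mem[s->esp + 1 + n];
}

/* First stub: mov eax, target; jmp eax. The stack is left untouched. */
static void stub_jmp_only(struct sim_stack *s)
{
    (void)s;
}

/* Second stub: push ebp; mov ebp, esp; sub esp, 32; then jmp eax. */
static void stub_with_prologue(struct sim_stack *s)
{
    sim_push(s, SIM_SAVED_EBP);
    s->esp -= 32 / 4;   /* sub esp, 32 moves esp down by 8 words */
}

/* Variant with call eax: the call pushes the stub's own return address. */
static void stub_with_call(struct sim_stack *s)
{
    sim_push(s, SIM_STUB_RET);
}
```

After sim_dispatch followed by stub_jmp_only, sim_read_arg returns the original arguments. After stub_with_prologue it returns the garbage marker. After stub_with_call, argument 0 reads as the dispatcher's return address and the real arguments appear one position late.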