Search code examples
linuxx86-64system-calls32bit-64bitptrace

Why does ptrace show a 32-bit execve system call having EAX = 59, the 64-bit call number? How do 32-bit system calls work on x86-64?


I'm toying ptrace with the code below. I found that the system call number for execve was 59 even when I compiled with the -m32 option. Since I'm using Ubuntu on a 64-bit machine, it could be understandable.

Soon, the question arose: "Do libc32 behave differently on 32-bit machine and 64-bit machine? Are they different?" So I checked what libc32 had in 64-bit. However, the execve system call number for libc was 11, which was identical the execv system call number for 32-bit systems. So where does the magic happen? Thank you in advance.

Here's the code. It's originated from https://www.linuxjournal.com/article/6100

#include <sys/ptrace.h>                                                                                                           
#include <sys/types.h>                                                                                                            
#include <sys/wait.h>                                                                                                             
#include <unistd.h>                                                                                                               
#include <sys/user.h>                                                                                                             
#include <stdio.h>                                                                                                                
                                                                                                                                  
int main()                                                                                                                        
{                                                                                                                                 
        pid_t child;                                                                                                              
        long  orig_eax;                                                                                                           
        child = fork();                                                                                                           
        if (child == 0) {                                                                                                         
                ptrace(PTRACE_TRACEME, 0, NULL, NULL);                                                                            
                execl("/bin/ls", "ls", NULL);                                                                                     
        } else {                                                                                                                  
                wait(NULL);                                                                                                       
                orig_eax = ptrace(PTRACE_PEEKUSER,                                                                                
#ifdef __x86_64__                                                                                                                 
                                child, &((struct user_regs_struct *)0)->orig_rax,                                                 
#else                                                                                                                             
                                child, &((struct user_regs_struct *)0)->orig_eax,                                                 
#endif                                                                                                                            
                                NULL);                                                                                            
                printf("The child made a "                                                                                       
                        "system call %ld\n", orig_eax);                                                                           
                ptrace (PTRACE_CONT, child, NULL, NULL);                                                                          
        }                                                                                                                         
        return 0;                                                                                                                 
}

Here's result from the code

~/my-sandbox/ptrace$ file s1 && ./s1
s1: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=f84894c2f5373051682858937bf54a66f21cbeb4, for GNU/Linux 3.2.0, not stripped
The child made a system call 59

~/my-sandbox/ptrace$ file s2 && ./s2
s2: ELF 32-bit LSB shared object, Intel 80386, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux.so.2, BuildID[sha1]=cac6a2bbeee164e27c11764c1b68f4ddd06405cf, for GNU/Linux 3.2.0, with debug_info, not stripped
The child made a system call 59

This is what I got from 32bit executable using gdb. As you can see it's using /lib/i386-linux-gnu/libc.so.6, and the system call number for execve is 11.

>>> bt
#0  0xf7e875a0 in execve () from /lib/i386-linux-gnu/libc.so.6
#1  0xf7e8799f in execl () from /lib/i386-linux-gnu/libc.so.6
#2  0x565562a4 in main () at simple1.c:15
>>> disassemble
Dump of assembler code for function execve:
=> 0xf7e875a0 <+0>: endbr32 
   0xf7e875a4 <+4>: push   %ebx
   0xf7e875a5 <+5>: mov    0x10(%esp),%edx
   0xf7e875a9 <+9>: mov    0xc(%esp),%ecx
   0xf7e875ad <+13>:    mov    0x8(%esp),%ebx
   0xf7e875b1 <+17>:    mov    $0xb,%eax
   0xf7e875b6 <+22>:    call   *%gs:0x10
   0xf7e875bd <+29>:    pop    %ebx
   0xf7e875be <+30>:    cmp    $0xfffff001,%eax
   0xf7e875c3 <+35>:    jae    0xf7dd9000
   0xf7e875c9 <+41>:    ret    
End of assembler dump.

Solution

  • execve is special; it's the only one that has special interaction with PTRACE_TRACEME. The way strace works, other system calls do show the 32-bit call number. (And modern strace needs special help to know whether that's a 32-bit call number for int 0x80 / sysenter, or a 64-bit call number, since 64-bit processes can still invoke int 0x80, although they normally shouldn't. This support was only added in 2019, with PTRACE_GET_SYSCALL_INFO)


    You're right, when the kernel is actually invoked, EAX holds 11, __NR_execve from unistd_32.h. It's set by mov $0xb,%eax before glibc's execve wrapper jumps to the VDSO page to enter the kernel via whatever efficient method is supported on this hardware (normally sysenter.)

    But execution doesn't actually stop until it reaches some code in the main execve implementation that checks for PTRACE_TRACEME and raises SIGTRAP.

    Apparently sometime before that happens, it calls void set_personality_64bit(void) in arch/x86/kernel/process_64.c, which includes

        /* Pretend that this comes from a 64bit execve */
        task_pt_regs(current)->orig_ax = __NR_execve;
    

    I found that by searching for __NR_execve in a kernel source browser, and looking at the most likely file in arch/x86. I didn't keep cross-referencing to find where that's called from; the fact that it exists (and the assumption of a sane non-obfuscated design) points very strongly to this being the answer to your mystery.