Search code examples
clinuxassemblycpusignal-handling

What happens at CPU-Level if you dereference a null pointer?


Suppose I have following program:

#include <signal.h>
#include <stddef.h>
#include <stdlib.h>

static void myHandler(int sig){
        abort();
}

int main(void){
        signal(SIGSEGV,myHandler);
        char* ptr=NULL;
        *ptr='a';
        return 0;
}

As you can see, I register a signalhandler and some lines further, I dereference a null pointer ==> SIGSEGV is triggered. But how is it triggered? If I run it using strace (Output stripped):

//Set signal handler (In glibc signal simply wraps a call to sigaction)
rt_sigaction(SIGSEGV, {sa_handler=0x563b125e1060, sa_mask=[SEGV], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7ffbe4fe0d30}, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
//SIGSEGV is raised
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=NULL} ---
rt_sigprocmask(SIG_UNBLOCK, [ABRT], NULL, 8) = 0
rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1], [SEGV], 8) = 0

But something is missing, how does a signal go from the CPU to the program? My understanding:

[Dereferences null pointer] -> [CPU raises an exception] -> [??? (How does it go from the CPU to the kernel?) ] -> [The kernel is notified, and sends the signal to the process] -> [??? (How does the process know, that a signal is raised?)] -> [The matching signal handler is called].

What happens at these two places marked with ????


Solution

  • A NULL pointer in most (but not all) C implementations is address 0. Normally this address is not in a valid (mapped) page.

    Any access to a virtual page that's not mapped by the HW page tables results in a page-fault exception. e.g. on x86, #PF.

    This invokes the OS's page-fault exception handler to resolve the situation. On x86-64 for example, the CPU pushes exception-return info on the kernel stack and loads a CS:RIP from the IDT (Interrupt Descriptor Table) entry that corresponds to that exception number. Just like any other exception triggered by user-space, e.g. integer divide by zero (#DE), or a General Protection fault #GP (trying to run a privileged instruction in user-space, or a misaligned SIMD instruction that required alignment, or many other possible things).

    The page-fault handler can find out what address user-space tried to access. e.g. on x86, there's a control register (CR2) that holds the linear (virtual) address that caused the fault. The OS can get a copy of that into a general-purpose register with mov rax, cr2.

    Other ISAs have other mechanisms for the OS to tell the CPU where its page-fault handler is, and for that handler to find out what address user-space was trying to access. But it's pretty universal for systems with virtual memory to have essentially equivalent mechanisms.


    The access is not yet known to be invalid. There are several reasons why an OS might not have bothered to "wire" a process's allocated memory into the hardware page tables. This is what paging is all about: letting the OS correct the situation, like copy-on-write, lazy allocation, or bringing a page back in from swap space.

    Page faults come in three categories: (copied from my answer on another question). Wikipedia's page-fault article says similar things.

    • valid (the process logically has the memory mapped, but the OS was lazy or playing tricks like copy-on-write):
      • hard: the page needs to be paged in from disk, either from swap space or from a disk file (e.g. a memory mapped file, like a page of an executable or shared library). Usually the OS will schedule another task while waiting for I/O: this is the key difference between hard (major) and soft (minor).
      • soft: No disk access required, just for example allocating + zeroing a new physical page to back a virtual page that user-space just tried to write. Or copy-on-write of a writeable page that multiple processes had mapped, but where changes by one shouldn't be visible to the other (like mmap(MAP_PRIVATE)). This turns a shared page into a private dirty page.
    • invalid: There wasn't even a logical mapping for that page. A POSIX OS like Linux will deliver SIGSEGV signal to the offending process/thread.

    So only after the OS consults its own data structures to see which virtual addresses a process is supposed to own can it be sure that the memory access was invalid.

    Deciding whether a page fault is invalid or not is completely up to software. As I wrote on Why page faults are usually handled by the OS, not hardware? - if the HW could figure everything out, it wouldn't need to trap to the OS.

    Fun fact: on Linux it's possible to configure the system so virtual address 0 is (or can be) valid. Setting mmap_min_addr = 0 allows processes to mmap there. e.g. WINE needs this for emulating a 16-bit Windows memory layout.

    Since that wouldn't change the internal object-representation of a NULL pointer to be other than 0, doing that would mean that NULL dereference would no longer fault. That makes debugging harder, which is why the default for mmap_min_addr is 64k.


    On a simpler system without virtual memory, the OS might still be able to configure an MMU to trap on memory access to certain regions of address space. The OS's trap handler doesn't have to check anything, it knows any access that triggered it was invalid. (Unless it's also emulating something for some regions of address space...)


    Delivering a signal to user-space

    This part is pure software. Delivering SIGSEGV is no different than delivering SIGALRM or SIGTERM sent by another process.

    Of course, a user-space process that just returns from a SIGSEGV handler without fixing the problem will make the main thread re-run the same faulting instruction again. (The OS would return to the instruction that raised the page-fault exception.)

    This is why the default action for SIGSEGV is to terminate, and why it doesn't make sense to set the behaviour to "ignore".