Tags: trace, ebpf, kprobe, bcc-bpf

Attaching BPF to sys_enter (tracepoint available through /proc/kallsyms)


I'm trying to build a tool wherein I attach a BPF program to the entry points for all syscalls. From the CLI, I was able to attach to all syscall entries via

sudo bpftrace -e 'tracepoint:syscalls:sys_enter_* /comm != "bpftrace"/ {printf("Process Name: %s\nSyscall Requested: %s\n", comm, probe);}'

which is great, but I want to do more complex stuff. I've found that I can attach BPF programs to kprobe events using BCC's Python front end, like so:

#!/usr/bin/python

from bcc import BPF
from bcc.utils import printb

prog = """
int hello(void *ctx){
    bpf_trace_printk("Hello, world!\\n");
    return 0;
}
"""

b = BPF(text=prog)
b.attach_kprobe(event="__x64_sys_clone", fn_name="hello")

print("TIME(s)", "COMM", "PID", "MESSAGE")

while 1:
    try:
        (task, pid, cpu, flags, ts, msg) = b.trace_fields()
    except ValueError:
        continue
    except KeyboardInterrupt:
        exit()
    printb(b"%-18.9f %-16s %-6d %s" %(ts, task, pid, msg))

However, in the attach_kprobe line I want to attach to all syscall entries rather than just sys_clone. I didn't find any sys_enter entry in /sys/kernel/debug/tracing/available_filter_functions, but I did find __tracepoint_sys_enter in /proc/kallsyms. When I tried replacing __x64_sys_clone with __tracepoint_sys_enter, I got an invalid argument error. Can I attach to all syscall entries (and eventually exits) using kprobes, or do I need a different tracing mechanism? Thanks!


Solution

  • There doesn't seem to be a single kprobe event that captures all syscall entry points; rather, there is a separate kprobe event for each syscall entry. While we could implement the required logic by attaching to each per-syscall kprobe event (specifically, by using the methodology outlined by pchaigno), we can achieve the same by attaching to a single tracepoint, raw_syscalls:sys_enter, like so:

    from bcc import BPF
    b = BPF(text = """
    TRACEPOINT_PROBE(raw_syscalls, sys_enter)
    {
        bpf_trace_printk("Hello world\\n");
        return 0;
    }
    """)
    
    while 1:
        try:
            (task, pid, cpu, flags, ts, msg) = b.trace_fields()
        except ValueError:
            continue
        print("%-18.9f %-16s %-6d %s" % (ts, task, pid, msg))
    

    Similarly, we can attach to all syscall exit points by using the raw_syscalls:sys_exit tracepoint.
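
To make the exit-point case concrete, here is a minimal sketch along the same lines as the snippet above. It is not from the original answer: the names PROG_TEMPLATE, TRACER_PID and build_prog are made up for illustration, and the fields args->id and args->ret are assumed to match what /sys/kernel/debug/tracing/events/raw_syscalls/sys_exit/format reports on your kernel. It also skips events produced by the tracer itself, similar in spirit to the comm filter in the bpftrace one-liner.

```python
#!/usr/bin/python
# Sketch: trace every syscall exit via the raw_syscalls:sys_exit tracepoint.
# Needs root and the bcc Python bindings to actually trace anything.
import os

# TRACER_PID is a placeholder (hypothetical name) substituted before compiling.
PROG_TEMPLATE = r"""
TRACEPOINT_PROBE(raw_syscalls, sys_exit)
{
    // Skip events generated by the tracing process (upper 32 bits = tgid).
    if ((bpf_get_current_pid_tgid() >> 32) == TRACER_PID)
        return 0;
    // id = syscall number, ret = return value (assumed tracepoint fields).
    bpf_trace_printk("id=%ld ret=%ld\n", args->id, args->ret);
    return 0;
}
"""


def build_prog(tracer_pid):
    """Return the BPF C source with the tracer's PID filled in."""
    return PROG_TEMPLATE.replace("TRACER_PID", str(int(tracer_pid)))


def main():
    try:
        from bcc import BPF  # imported lazily so the file loads without bcc
    except ImportError:
        print("bcc Python bindings not found; nothing to trace")
        return

    # TRACEPOINT_PROBE programs are attached automatically when loaded.
    b = BPF(text=build_prog(os.getpid()))
    print("%-18s %-16s %-6s %s" % ("TIME(s)", "COMM", "PID", "MESSAGE"))
    while True:
        try:
            (task, pid, cpu, flags, ts, msg) = b.trace_fields()
        except ValueError:
            continue
        except KeyboardInterrupt:
            break
        print("%-18.9f %-16s %-6d %s" % (
            ts, task.decode("utf-8", "replace"), pid,
            msg.decode("utf-8", "replace")))


if __name__ == "__main__":
    main()
```

Run it as root. Swapping sys_exit for sys_enter (and dropping args->ret, which that tracepoint does not have) should give the entry-point counterpart. The PID filter is a design choice rather than a requirement: without it, the tracer's own reads of the trace pipe show up in the output.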