The idea of my eBPF program is to trace data on some scheduler-related tracepoints such as sched_wakeup.
For some reasons, when these tracepoints are triggered I need to know which cgroup it happens in.
To achieve that, I've found a way to get the cgroup name through bpf_get_current_task() -> cgroups -> subsys -> cgroup -> kn -> name. The name is a variable of type char *.
So I want to create a BPF map to pass this out to my main Go program. The key of this map is of type char *, to store one cgroup's name (basically its file system path), and the value of this map is of type u64, for example.
It looks like the map does return some address value such as 0x00000000 (just a random address). So in Go I use cilium/ebpf to read it with a var cgroupName unsafe.Pointer. When I want to print it out, I use *(*string)(cgroupName), but it only prints out '' (nil value).
Is this because the address is in kernel space, or on the BPF stack, or some other address that my Go program (which runs in user space) cannot access? Or is there anything wrong with my whole idea?
To make it clearer, you can refer to the code below:
bpf.c
#include "vmlinux.h"
#include "bpf_helpers.h"
#include "bpf_core_read.h"

struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 10240);
    __type(key, u32);
    __type(value, char *);
} pid_cgroup_name SEC(".maps");

SEC("tp/sched/sched_wakeup")
int handle__sched_wakeup(struct sched_wakeup_tp_args *ctx)
{
    struct task_struct *task = (void *)bpf_get_current_task();
    return trace_enqueue(task);
}

static __always_inline
int trace_enqueue(struct task_struct *task)
{
    u32 pid;
    struct css_set *cgroups;
    struct cgroup_subsys_state *subsys[14];
    struct cgroup *cg;
    struct kernfs_node *kn;
    char *cgroup_name;

    bpf_core_read(&cgroups, sizeof(cgroups), &task->cgroups);
    bpf_core_read(&subsys, sizeof(subsys), &cgroups->subsys);
    bpf_core_read(&cg, sizeof(cg), &subsys[1]->cgroup);
    bpf_core_read(&kn, sizeof(kn), &cg->kn);
    bpf_core_read(&cgroup_name, sizeof(cgroup_name), &kn->name);
    if (!cgroup_name)
        return 0;

    bpf_core_read(&pid, sizeof(pid), &task->tgid);
    bpf_map_update_elem(&pid_cgroup_name, &pid, &cgroup_name, 0);
    return 0;
}

main.go
package main

import (
    "C"
    "github.com/cilium/ebpf/link"
    "github.com/cilium/ebpf/rlimit"
    "log"
    "time"
)

// $BPF_CLANG and $BPF_CFLAGS are set by the Makefile.
//go:generate go run github.com/cilium/ebpf/cmd/bpf2go -cc $BPF_CLANG -cflags $BPF_CFLAGS bpf bpf.c -- -I../headers -I../csl-headers
func main() {
    // Allow the current process to lock memory for eBPF resources.
    if err := rlimit.RemoveMemlock(); err != nil {
        log.Fatal(err)
    }

    // Load pre-compiled programs and maps into the kernel.
    objs := bpfObjects{}
    if err := loadBpfObjects(&objs, nil); err != nil {
        log.Fatalf("loading objects: %v", err)
    }
    defer objs.Close()

    tpWakeup, err := link.Tracepoint("sched", "sched_wakeup", objs.HandleSchedWakeup, nil)
    if err != nil {
        log.Fatalf("opening tracepoint: %s", err)
    }
    defer tpWakeup.Close()

    ticker := time.NewTicker(2 * time.Second)
    defer ticker.Stop()

    log.Println("Waiting for events..")
    for range ticker.C {
        mapIterator := objs.PidCgroupName.Iterate()
        var pid uint32
        for mapIterator.Next(&pid, cgroupName) {
            log.Printf("get pid %v for cgroup name: %s", pid, cgroupName)
        }
    }
}

Because I use cilium/ebpf to write main.go, you first need to run the go generate command, which produces the bpf_bpfel.go code.
Then you can use the command go run main.go bpf_bpfel.go to see some results.
It looks like this:
2023/03/08 16:20:11 get pid 12345 for cgroup name:
You can see that the cgroup name prints out nothing.
One issue here is that you are trying to pass a kernel pointer to userspace and are expecting that to work. A kernel address means nothing inside your process's address space, so the value stored in the map is just a number on the Go side. I can't tell from the code you submitted what type cgroupName is, but in any case it seems like you are not dereferencing the pointer to the C string, since that would almost certainly cause a SEGFAULT.
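To make that concrete, here is a minimal sketch of what the original map value amounts to on the Go side. It assumes a little-endian host (the bpfel build) and uses a made-up address; kernelAddr is a hypothetical helper for illustration, not part of cilium/ebpf.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// kernelAddr interprets an 8-byte map value as the little-endian integer it is.
// Hypothetical helper; assumes a little-endian machine.
func kernelAddr(raw [8]byte) uint64 {
	return binary.LittleEndian.Uint64(raw[:])
}

func main() {
	// Made-up bytes standing in for a `char *` value read back from the map.
	raw := [8]byte{0x40, 0x2a, 0x5b, 0x81, 0x88, 0x88, 0xff, 0xff}

	// All user space receives is the address itself; the string it
	// points to lives in kernel memory and was never copied.
	fmt.Printf("map value: %#x\n", kernelAddr(raw))
}
```

The eight bytes carry no characters of the cgroup name at all, which is why the string has to be copied inside the BPF program.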
Instead, you should copy the string. Start by changing your map's value type to a char array with some max capacity:

#define MAX_SIZE 128

struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 10240);
    __type(key, u32);
    __type(value, char[MAX_SIZE]);
} pid_cgroup_name SEC(".maps");

Then in trace_enqueue we also change cgroup_name to be an array of the same size. We can use the bpf_core_read_str function to do a CO-RE read for the string, giving it the max size of our array. We can then write the array into the map.

static __always_inline
int trace_enqueue(struct task_struct *task)
{
    u32 pid;
    struct css_set *cgroups;
    struct cgroup_subsys_state *subsys[14];
    struct cgroup *cg;
    struct kernfs_node *kn;
    const char *name;
    /* Zeroed so the bytes after the terminating NUL are defined. */
    char cgroup_name[MAX_SIZE] = {};
    long name_len;

    bpf_core_read(&cgroups, sizeof(cgroups), &task->cgroups);
    bpf_core_read(&subsys, sizeof(subsys), &cgroups->subsys);
    bpf_core_read(&cg, sizeof(cg), &subsys[1]->cgroup);
    bpf_core_read(&kn, sizeof(kn), &cg->kn);

    /* kn->name is a char *, so fetch the pointer first,
     * then copy the string it points to. */
    bpf_core_read(&name, sizeof(name), &kn->name);
    name_len = bpf_core_read_str(&cgroup_name, MAX_SIZE, name);
    if (name_len < 0)
        return 0;

    bpf_core_read(&pid, sizeof(pid), &task->tgid);
    bpf_map_update_elem(&pid_cgroup_name, &pid, &cgroup_name, 0);
    return 0;
}

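As a rough illustration of what the bounded string read does with MAX_SIZE, the sketch below models its documented behaviour in plain Go: at most size-1 bytes are copied, the result is always NUL-terminated, and the return value counts the trailing NUL. readStr is a hypothetical model written for this answer, not the kernel helper, and returning 0 for an empty buffer is a choice of the sketch.

```go
package main

import "fmt"

// readStr models a bounded C-string copy: it copies at most len(dst)-1
// bytes of src, NUL-terminates dst, and returns the number of bytes
// written including the trailing NUL. Hypothetical model only.
func readStr(dst []byte, src string) int {
	if len(dst) == 0 {
		return 0
	}
	n := copy(dst[:len(dst)-1], src)
	dst[n] = 0
	return n + 1
}

func main() {
	// The name fits: 6 characters plus the NUL.
	roomy := make([]byte, 16)
	fmt.Println(readStr(roomy, "docker")) // 7

	// The name does not fit: it is truncated to 3 characters plus the NUL.
	tiny := make([]byte, 4)
	fmt.Println(readStr(tiny, "system.slice"), string(tiny[:3])) // 4 sys
}
```

So a name longer than MAX_SIZE-1 bytes ends up truncated in the map rather than causing an error.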
On the Go side the map value can be interpreted as [128]byte. You can convert it to a slice, then use ByteSliceToString (from golang.org/x/sys/unix) to strip the null bytes and convert it to a string.
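A minimal sketch of that conversion, using only the standard library so it runs on its own: cString is a hypothetical stand-in for ByteSliceToString, and the sample value is made up rather than read from a real map.

```go
package main

import (
	"bytes"
	"fmt"
)

// maxSize mirrors MAX_SIZE in the BPF program.
const maxSize = 128

// cString returns the bytes before the first NUL as a Go string.
// If the buffer contains no NUL, the whole buffer is used.
// Hypothetical stand-in for ByteSliceToString.
func cString(buf []byte) string {
	if i := bytes.IndexByte(buf, 0); i >= 0 {
		buf = buf[:i]
	}
	return string(buf)
}

func main() {
	// Made-up map value: a cgroup name followed by NUL padding.
	var value [maxSize]byte
	copy(value[:], "system.slice")

	fmt.Printf("cgroup name: %q\n", cString(value[:]))
}
```

In the iteration loop from the question, this would presumably mean declaring var cgroupName [128]byte, passing &cgroupName to mapIterator.Next, and logging cString(cgroupName[:]).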