I'm setting up profiling for software I've written, but I can't get the context-switch count working using perf_event_open.
To test the problem, I also tried the sample code provided on the perf_event_open man page, using sched_yield and running a parallel process on the same core with taskset to force context switches. The context-switch count from perf_event_open() still remains 0, while perf stat reports non-zero numbers (in the thousands for large loops). I've also tried doing a file read and using mmap to force page faults.
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/perf_event.h>
#include <asm/unistd.h>
#include <iostream>
#include <string.h>
#include <sys/mman.h>
using namespace std;
int buf_size_shift = 8;
static unsigned perf_mmap_size(int buf_size_shift)
{
return ((1U << buf_size_shift) + 1) * sysconf(_SC_PAGESIZE);
}
static long
perf_event_open(struct perf_event_attr *hw_event, pid_t pid,
int cpu, int group_fd, unsigned long flags)
{
int ret;
ret = syscall(__NR_perf_event_open, hw_event, pid, cpu,
group_fd, flags);
return ret;
}
int main(int argc, char **argv)
{
struct perf_event_attr pe;
long long count;
int fd;
memset(&pe, 0, sizeof(struct perf_event_attr));
pe.type = PERF_TYPE_SOFTWARE;
//pe.sample_type = PERF_SAMPLE_CALLCHAIN; /* this is what allows you to obtain callchains */
pe.size = sizeof(struct perf_event_attr);
pe.config = PERF_COUNT_SW_CONTEXT_SWITCHES;
pe.disabled = 1;
pe.exclude_kernel = 1;
pe.sample_period = 1000;
pe.exclude_hv = 1;
fd = perf_event_open(&pe, 0, -1, -1, 0);
if (fd == -1) {
fprintf(stderr, "Error opening leader %llx\n", pe.config);
exit(EXIT_FAILURE);
}
/* associate a buffer with the file */
struct perf_event_mmap_page *mpage;
mpage = (perf_event_mmap_page*) mmap(NULL, perf_mmap_size(buf_size_shift),
PROT_READ|PROT_WRITE, MAP_SHARED,
fd, 0);
if (mpage == (struct perf_event_mmap_page *)-1L) {
close(fd);
return -1;
}
ioctl(fd, PERF_EVENT_IOC_RESET, 0);
ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
printf("Measuring instruction count for this printf\n");
long long sum = 0;
for (long long i = 0; i < 10000000000; i++) {
sum += i;
if (i%1000000 == 0)
cout << i << " : " << sum << endl;
}
ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);
read(fd, &count, sizeof(long long));
printf("Used %lld cs\n", count);
close(fd);
}
This code, with type = PERF_TYPE_SOFTWARE and config = PERF_COUNT_SW_CONTEXT_SWITCHES, outputs a count of 0 even with forced context switches, while other metrics work.
When using the mmap ring buffer, I see PERF_RECORD_SWITCH records on reading it, which, according to my understanding, means that context-switch events are being recorded.
Any information on how the perf count and the data in the ring buffer are related is also appreciated.
The events are not counted because you exclude events from the kernel (exclude_kernel = 1;), and PERF_TYPE_SOFTWARE events are generally provided by the kernel.
If you remove exclude_kernel, the events are counted.
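As a minimal sketch of that fix, the counter below is opened with exclude_kernel left at 0. The helper name count_context_switches is hypothetical and not part of any perf API; it returns -1 when the counter cannot be opened or read. One assumption worth checking on your machine: with a restrictive kernel.perf_event_paranoid setting, opening an event that includes kernel activity may fail with EACCES for unprivileged users.

```cpp
#include <linux/perf_event.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <cstring>

// Hypothetical helper: counts context switches of the calling thread while
// `workload` runs. Returns -1 if the counter could not be opened or read.
template <typename F>
long long count_context_switches(F workload)
{
    perf_event_attr pe;
    std::memset(&pe, 0, sizeof(pe));
    pe.type = PERF_TYPE_SOFTWARE;
    pe.size = sizeof(pe);
    pe.config = PERF_COUNT_SW_CONTEXT_SWITCHES;
    pe.disabled = 1;
    pe.exclude_hv = 1;
    // exclude_kernel stays 0: the switch event is generated inside the
    // kernel, so excluding kernel events would leave the count at 0.

    // pid = 0, cpu = -1: measure the calling thread on any CPU.
    int fd = static_cast<int>(syscall(__NR_perf_event_open, &pe, 0, -1, -1, 0UL));
    if (fd == -1)
        return -1;

    ioctl(fd, PERF_EVENT_IOC_RESET, 0);
    ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
    workload();
    ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);

    long long count = -1;
    if (read(fd, &count, sizeof(count)) != static_cast<ssize_t>(sizeof(count)))
        count = -1;
    close(fd);
    return count;
}
```

Calling it with a workload that sleeps, for example count_context_switches([] { for (int i = 0; i < 100; ++i) usleep(1000); }), should typically report a non-zero count, since each sleep schedules the thread out.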
The connection between the count and the recorded events in the ring buffer is the sample_period. Your setting of pe.sample_period = 1000; means that for every 1000 switch events, a PERF_RECORD_SAMPLE event is written to the ring buffer.
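To make the arithmetic concrete: with a fixed sample_period, one would expect roughly one sample per full period of counted events. The helper expected_samples below is purely illustrative (a hypothetical name, not a perf API) and ignores throttling and records lost to buffer overflow.

```cpp
#include <cstdint>

// Illustrative only: number of PERF_RECORD_SAMPLE records expected for
// `count` events when one sample is emitted every `sample_period` events.
constexpr std::uint64_t expected_samples(std::uint64_t count,
                                         std::uint64_t sample_period)
{
    return sample_period == 0 ? 0 : count / sample_period;
}
```

For example, 3500 context switches with sample_period = 1000 would give 3 samples, and fewer than 1000 switches would give none.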
The following example of reading the buffer only illustrates the general approach. In practice you need to handle events that wrap around the end of the buffer and do more consistency checks.
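// Snapshot the consumer (tail) and producer (head) positions.
// Production code should issue a read barrier after loading data_head.
auto tail = mpage->data_tail;
const auto head = mpage->data_head;
const auto size = mpage->data_size;
// The data area starts after the first (metadata) page.
char* data = reinterpret_cast<char*>(mpage) + sysconf(_SC_PAGESIZE);
int events = 0;
while (tail < head) {
    // Every record starts with a perf_event_header giving its type and size.
    auto event_header_p = reinterpret_cast<perf_event_header*>(data + (tail % size));
    std::cout << "event type: " << event_header_p->type << ", size: " << event_header_p->size << "\n";
    tail += event_header_p->size;
    events++;
}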
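One way to deal with the wrap-around mentioned above is to copy each record into a contiguous buffer before interpreting it. The following is only a sketch under stated assumptions: ring_copy and read_record are hypothetical helper names, data and size stand for the data area and data_size from the mmap page, and tail and head are the free-running positions read from it.

```cpp
#include <linux/perf_event.h>
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Copy `len` bytes starting at logical position `pos` out of a ring buffer
// of `size` bytes, continuing at the start of the buffer when the end is hit.
static void ring_copy(const char* data, std::uint64_t size, std::uint64_t pos,
                      void* dst, std::size_t len)
{
    const std::uint64_t off = pos % size;
    const std::size_t first =
        static_cast<std::size_t>(std::min<std::uint64_t>(len, size - off));
    std::memcpy(dst, data + off, first);
    std::memcpy(static_cast<char*>(dst) + first, data, len - first);
}

// Return the complete record at `tail` as contiguous bytes, or an empty
// vector if the region between `tail` and `head` holds no complete record.
static std::vector<char> read_record(const char* data, std::uint64_t size,
                                     std::uint64_t tail, std::uint64_t head)
{
    perf_event_header hdr;
    if (head - tail < sizeof(hdr))
        return {};
    ring_copy(data, size, tail, &hdr, sizeof(hdr));
    // Basic consistency check on the size claimed by the header.
    if (hdr.size < sizeof(hdr) || hdr.size > head - tail)
        return {};
    std::vector<char> record(hdr.size);
    ring_copy(data, size, tail, record.data(), record.size());
    return record;
}
```

In the loop above, the returned vector would take the place of the raw pointer into the buffer, and tail would advance by the size of the returned record.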
You should find a corresponding number of events of type PERF_RECORD_SAMPLE == 9 in the buffer (unless there is an overflow). If you want to read them, you need to cast the pointer to an appropriate struct. The actual layout of PERF_RECORD_SAMPLE events, or any other events, depends on your perf_event_attr configuration and is documented in the perf_event_open man page.
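As an illustration of such a struct, the sketch below assumes attr.sample_type was set to PERF_SAMPLE_TID | PERF_SAMPLE_TIME, which is not what the code in the question does (it leaves sample_type at 0, so its samples presumably carry little beyond the header). The names sample_tid_time and decode_sample are hypothetical. The record is copied rather than cast in place, which sidesteps alignment and wrap-around concerns.

```cpp
#include <linux/perf_event.h>
#include <cstdint>
#include <cstring>

// Assumed layout for sample_type == (PERF_SAMPLE_TID | PERF_SAMPLE_TIME).
// Other sample_type bits add further fields; see the man page for the order.
struct sample_tid_time {
    perf_event_header header;
    std::uint32_t pid, tid;  // present because of PERF_SAMPLE_TID
    std::uint64_t time;      // present because of PERF_SAMPLE_TIME
};

// Copy a contiguous record into the struct. Returns false if the record is
// not a PERF_RECORD_SAMPLE of exactly the assumed size.
static bool decode_sample(const void* record, sample_tid_time& out)
{
    perf_event_header hdr;
    std::memcpy(&hdr, record, sizeof(hdr));
    if (hdr.type != PERF_RECORD_SAMPLE || hdr.size != sizeof(sample_tid_time))
        return false;
    std::memcpy(&out, record, sizeof(out));
    return true;
}
```

The size check is deliberately strict: if the configured sample_type differs from the assumed one, the record size will usually differ too, and the function refuses to decode it.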