Tags: c, profiling, execution-time, oprofile

How to use oprofile to calculate execution time of a part of C program?


I want to profile a portion of my C code (user_defined_function()) using oprofile and calculate the time taken to execute it. Any pointers on how to do this would be very helpful. Thanks in advance!

#include <stdio.h>

void user_defined_function(void);  /* the function to be profiled, defined elsewhere */

int main(void)
{
    // some statements;

    // Begin Profiling
    user_defined_function();
    // End Profiling

    // some statements;
    return 0;
}

Solution

  • I don't see any turn-on / turn-off markers in the oprofile documentation (http://oprofile.sourceforge.net/doc/index.html and http://oprofile.sourceforge.net/faq/). Calling opcontrol (via fork+exec) with --start and --stop around the region will probably help, provided the code to be profiled runs long enough.

    With the perf tool in profiling (sampling) mode, perf record (and probably operf, which is based on the same perf_event_open syscall), you can profile the full program and add markers at the Begin Profiling and End Profiling points using some custom tracing event. Then dump the entire perf.data with perf script, find your marker events, and keep only the part of the profile between them. Every event in perf.data has a timestamp, and the events are either ordered by time or can be sorted by it.

    With direct use of the perf_event_open syscall, you can enable and disable profiling from within the same process using the ioctl calls described in the "man 2 perf_event_open" page: apply the PERF_EVENT_IOC_ENABLE / PERF_EVENT_IOC_DISABLE actions to the perf file descriptor. The man page also describes using prctl to temporarily disable and re-enable profiling of the program (this may even work with oprofile: disable at the start of main, enable at Begin, disable at End):

    Using prctl(2): A process can enable or disable all the event groups that are attached to it using the prctl(2) PR_TASK_PERF_EVENTS_ENABLE and PR_TASK_PERF_EVENTS_DISABLE operations.

    Another way to use performance counters is counting rather than sampling (perf stat ./your_program or perf stat -d ./your_program does this). This mode will not give you a list of 'hot' functions; it will only tell you that your code executed, say, 100 million instructions in 130 million cycles, with 10 million L1 cache hits and 5 million L1 cache misses. There are wrappers that enable counting for parts of a program, for example: PAPI http://icl.cs.utk.edu/papi/ (PAPI_start_counters), perfmon2 (libpfm3, libpfm4), https://github.com/RRZE-HPC/likwid (pdf, likwid_markerStartRegion), http://halobates.de/jevents.html & http://halobates.de/simple-pmu, etc.