
Linux ftrace: why is function instrumentation done through the mcount function (gcc `-pg` option)?


With this question I aim to survey the instrumentation techniques used by Linux ftrace. According to ftrace.txt:

If CONFIG_DYNAMIC_FTRACE is set, the system will run with virtually no overhead when function tracing is disabled. The way this works is the mcount function call (placed at the start of every kernel function, produced by the -pg switch in gcc), starts of pointing to a simple return. (Enabling FTRACE will include the -pg switch in the compiling of the kernel.)

The mcount call happens just before or just after the instrumented function's prologue (to the best of my knowledge, whether "before" or "after" depends on how glibc implements the mcount function on your specific architecture).

However, this is not enough for the function graph tracer of ftrace, which traces both the entry and the exit of a function. Using the mcount mechanism to capture the exit of a function requires some tricky manipulation of the stack and call sequence. More details are in ftrace-design.txt.

Briefly, since the -pg compiler option only adds instrumentation at function entry, the ftrace subsystem needs to adjust the register and stack state before returning to execute the instrumented function, so that ftrace can regain control when the function exits.
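To make the idea concrete, here is a minimal user-space model of that return-address trick. It is only a sketch: every name in it (`hook_entry`, `hook_exit`, `shadow`, `TRAMPOLINE_ADDR`) is hypothetical, addresses are plain integers, and no real stack is touched. The entry hook saves the original return address on a side ("shadow") stack and overwrites the slot with a trampoline address; the exit hook, which the trampoline would call, hands the original address back.

```c
#include <stddef.h>
#include <stdint.h>

#define SHADOW_DEPTH    16
/* Stand-in for the address of the return trampoline (assumed value). */
#define TRAMPOLINE_ADDR ((uintptr_t)0xdead0000u)

/* One saved frame: where the function should really return, and who it is. */
struct shadow_entry {
    uintptr_t ret;
    uintptr_t func;
};

static struct shadow_entry shadow[SHADOW_DEPTH];
static int shadow_top;

/*
 * Entry hook: remember the real return address, then redirect the
 * return slot to the trampoline. Returns 0 on success, or -1 if the
 * shadow stack is full (the slot is then left untouched).
 */
static int hook_entry(uintptr_t *ret_slot, uintptr_t func)
{
    if (shadow_top >= SHADOW_DEPTH)
        return -1;
    shadow[shadow_top].ret = *ret_slot;
    shadow[shadow_top].func = func;
    shadow_top++;
    *ret_slot = TRAMPOLINE_ADDR;
    return 0;
}

/*
 * Exit hook: pop the most recent frame and return the original
 * return address so the trampoline can jump back to the real caller.
 * Returns 0 if nothing was recorded.
 */
static uintptr_t hook_exit(uintptr_t *func_out)
{
    if (shadow_top == 0)
        return 0;
    shadow_top--;
    if (func_out != NULL)
        *func_out = shadow[shadow_top].func;
    return shadow[shadow_top].ret;
}
```

In this model, the "tricky" part is visible in `hook_entry`: the function being traced is never told that its return slot was rewritten, so the tracer must keep its own record in order to restore the real caller later.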

I find this process complex, especially when the end of a function must be instrumented as well. So I wonder why the kernel is compiled with gcc's -pg option instead of -finstrument-functions. The latter would avoid the above-mentioned process of saving the return address. From the GCC documentation (see the paragraph on -finstrument-functions), this option looks friendlier than -pg. Here is a short excerpt:

-finstrument-functions Generate instrumentation calls for entry and exit to functions. Just after function entry and just before function exit, the following profiling functions are called with the address of the current function and its call site.

void __cyg_profile_func_enter (void *this_fn, void *call_site);

void __cyg_profile_func_exit (void *this_fn, void *call_site);
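For reference, a minimal user-space sketch of how these two hooks could be defined is shown here. The counters (`depth`, `enter_count`, `exit_count`) and the `square` function are made up for illustration. The hooks carry the `no_instrument_function` attribute so that they are not instrumented themselves, which would otherwise recurse.

```c
/* Illustrative counters updated by the hooks (hypothetical names). */
static int depth;
static int enter_count;
static int exit_count;

/* Called on entry to every instrumented function. */
__attribute__((no_instrument_function))
void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
    (void)this_fn;
    (void)call_site;
    enter_count++;
    depth++;
}

/* Called just before every instrumented function returns. */
__attribute__((no_instrument_function))
void __cyg_profile_func_exit(void *this_fn, void *call_site)
{
    (void)this_fn;
    (void)call_site;
    depth--;
    exit_count++;
}

/* An ordinary function; it is instrumented only when the file is
 * compiled with -finstrument-functions. */
int square(int x)
{
    return x * x;
}
```

When the file is built with `gcc -finstrument-functions`, each call to `square` should invoke the enter hook before its body and the exit hook before it returns; without the flag the hooks are simply never called.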


Solution

  • Actually, if you look up the mcount() symbol in the latest Linux kernel source, you will find that mcount() (which may be present in the xxx_entry_xx.S files) is no longer used. Modern processors provide more useful profiling instructions than the compiler does (as you have already mentioned above). The function_graph tracer is a superset of the function tracer; the following is documented in an older manual:

    The mcount function should check the function pointers ftrace_graph_return (compare to ftrace_stub) and ftrace_graph_entry (compare to ftrace_graph_entry_stub). If either of those is not set to the relevant stub function, call the arch-specific function ftrace_graph_caller which in turn calls the arch-specific function prepare_ftrace_return. Neither of these function names is strictly required, but you should use them anyway to stay consistent across the architecture ports – easier to compare & contrast things.

    So, in conclusion, mcount checks not only the function tracer's hook but also the graph tracer's hooks, and it does some register saving and restoring to meet the needs of stack tracing.
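The check described in the quoted manual can be modelled in plain C roughly as follows. This is a sketch under stated assumptions, not kernel code: all `_sim` names, the simplified signatures, and the `PATH_*` flags are hypothetical stand-ins, and the real implementation is architecture-specific assembly. The point is only that a hook counts as "enabled" when its pointer no longer equals the corresponding stub.

```c
#include <stddef.h>

typedef void (*trace_fn)(unsigned long ip, unsigned long parent_ip);
typedef int (*graph_entry_fn)(unsigned long ip);

#define PATH_FUNCTION 1 /* the function tracer hook was called */
#define PATH_GRAPH    2 /* the graph caller was invoked */

static int function_hits;
static int graph_hits;

/* Stubs: they do nothing; only their addresses matter. */
static void stub_sim(unsigned long ip, unsigned long parent_ip)
{
    (void)ip;
    (void)parent_ip;
}

static int graph_entry_stub_sim(unsigned long ip)
{
    (void)ip;
    return 0;
}

/* Hook pointers start out at their stubs, i.e. tracing is disabled. */
static trace_fn trace_function_sim = stub_sim;
static trace_fn graph_return_sim = stub_sim;
static graph_entry_fn graph_entry_sim = graph_entry_stub_sim;

/* Example tracer callbacks that just count invocations. */
static void count_function(unsigned long ip, unsigned long parent_ip)
{
    (void)ip;
    (void)parent_ip;
    function_hits++;
}

static int count_graph_entry(unsigned long ip)
{
    (void)ip;
    graph_hits++;
    return 1;
}

/* Stand-in for the arch-specific graph caller. */
static void graph_caller_sim(unsigned long ip)
{
    graph_entry_sim(ip);
}

/* Model of the mcount check; returns a bitmask of the paths taken. */
static int mcount_sim(unsigned long ip, unsigned long parent_ip)
{
    int paths = 0;

    if (trace_function_sim != stub_sim) {
        trace_function_sim(ip, parent_ip);
        paths |= PATH_FUNCTION;
    }
    if (graph_return_sim != stub_sim ||
        graph_entry_sim != graph_entry_stub_sim) {
        graph_caller_sim(ip);
        paths |= PATH_GRAPH;
    }
    return paths;
}
```

With every pointer left at its stub, `mcount_sim` takes neither path; assigning `count_function` to `trace_function_sim` or `count_graph_entry` to `graph_entry_sim` enables the corresponding branch.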