I wrote a BPF object file containing one section and a static inline function, defined as follows:
static inline __attribute__((always_inline)) bpf_call_func(...);
__section("entry") bpf_func(...); // called bpf_call_func
This worked well: llvm-objdump showed that bpf_call_func had been inlined.
But then I defined another section in the same object file that also calls bpf_call_func:
static inline __attribute__((always_inline)) bpf_call_func(...);
__section("entry") bpf_func(...); // called bpf_call_func
__section("entry2") bpf_func2(...); // called bpf_call_func
This time llvm-objdump showed that bpf_call_func was inlined into neither bpf_func nor bpf_func2. It was defined in the .text section instead, and both bpf_func and bpf_func2 used a call instruction to reach it.
bpf_call_func is about 600 instructions. bpf_func and bpf_func2 are about 250 instructions.
I looked at the GCC manual, which says:
Note that certain usages in a function definition can make it unsuitable for inline substitution. Among these usages are: variadic functions, use of alloca, use of computed goto (see Labels as Values), use of nonlocal goto, use of nested functions, use of setjmp, use of __builtin_longjmp and use of __builtin_return or __builtin_apply_args. Using -Winline warns when a function marked inline could not be substituted, and gives the reason for the failure.
But I don't know which of these conditions matches my case.
Why isn't bpf_call_func inlined when two sections call it? Is it related to the number of instructions in bpf_call_func?
From what I can find, there is no way to actually force clang to inline a function. This is what the clang reference says about always_inline:
Inlining heuristics are disabled and inlining is always attempted regardless of optimization level.
Does not guarantee that inline substitution actually occurs.
This seems to be clang-specific, since GCC states that it will always inline, as the attribute suggests, or raise an error (for calls within a unit):
always_inline
Generally, functions are not inlined unless optimization is specified. For functions declared inline, this attribute inlines the function independent of any restrictions that otherwise apply to inlining. Failure to inline such a function is diagnosed as an error. Note that if such a function is called indirectly the compiler may or may not inline it depending on optimization level and a failure to inline an indirect call may or may not be diagnosed.
GCC provides a -Winline flag so that the compiler warns about functions that were not inlined, but clang ignores it:
-Winline
This diagnostic flag exists for GCC compatibility, and has no effect in Clang.
So it seems that clang treats the always_inline attribute as a hint and will silently skip inlining, with no error or warning. In your case it likely decided that your inline function is too large.
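Since clang gives no diagnostic, the practical way to see what happened is to build a reduced version of the layout and disassemble it. The following is only a sketch under assumptions, not the original program: the `SEC` macro, the `int` parameters, and the placeholder bodies are hypothetical stand-ins for the elided parts of the question, and the real helper is far larger than this one.

```c
/* Hypothetical stand-in for the question's __section() macro.
 * Plain section names like "entry" are only valid on ELF targets,
 * so the macro expands to nothing elsewhere. */
#if defined(__ELF__)
#define SEC(name) __attribute__((section(name), used))
#else
#define SEC(name)
#endif

/* Shared helper marked always_inline. The body is a placeholder;
 * the helper in the question is about 600 instructions long. */
static inline __attribute__((always_inline)) int bpf_call_func(int x)
{
    return x * 2 + 1;
}

/* Two entry points in separate sections, both calling the helper. */
SEC("entry") int bpf_func(int x)
{
    return bpf_call_func(x) + 10;
}

SEC("entry2") int bpf_func2(int x)
{
    return bpf_call_func(x) + 20;
}
```

Assuming the file is saved as `prog.c` (a made-up name), compiling it with `clang -O2 -target bpf -c prog.c -o prog.o` and running `llvm-objdump -d prog.o` shows whether a separate copy of `bpf_call_func` exists in `.text` and whether the entry points contain `call` instructions. A helper this small will almost certainly be inlined, so the sketch only shows the method of checking, not the failure itself.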
And to be fair, unless you need to support kernels older than 4.16, it doesn't matter that much, since eBPF supports function calls nowadays.
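If function calls are acceptable, the helper can be kept as a real function on purpose rather than left to the inliner. This is a minimal sketch, again with a hypothetical `SEC` macro, `int` parameters, and placeholder bodies that are not taken from the question:

```c
/* Hypothetical stand-in for the question's __section() macro;
 * expands to nothing on non-ELF targets. */
#if defined(__ELF__)
#define SEC(name) __attribute__((section(name), used))
#else
#define SEC(name)
#endif

/* noinline asks the compiler to emit one shared copy of the helper,
 * which both entry points reach through a call instruction. */
static __attribute__((noinline)) int bpf_call_func(int x)
{
    return x * 2 + 1;
}

SEC("entry") int bpf_func(int x)
{
    return bpf_call_func(x) + 10;
}

SEC("entry2") int bpf_func2(int x)
{
    return bpf_call_func(x) + 20;
}
```

This makes the layout you observed explicit: one copy of `bpf_call_func` in `.text`, called from both sections. Besides the kernel version, the loader you use has to be able to relocate those calls into `.text`, so check that before relying on it.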