c gcc assembly elf gnu-assembler

How to get the size of a C function from inside a C program or with inline assembly?


Suppose I have a function like below:

# cat 003.c

int foo(int a, int b)
{
    return a+b;
}

And compile it like this:

gcc -S 003.c

This produces the following assembly:

     .file   "003.c"
     .text
 .globl foo
     .type   foo, @function
 foo:
 .LFB2:
     pushq   %rbp
 .LCFI0:
     movq    %rsp, %rbp
 .LCFI1:
     movl    %edi, -4(%rbp)
     movl    %esi, -8(%rbp)
     movl    -8(%rbp), %edx
     movl    -4(%rbp), %eax
     addl    %edx, %eax
     leave
     ret
 .LFE2:
     .size   foo, .-foo /* size of the function foo, how to get it?*/

The last line above does compute the size of the function. Where does the compiler store the size? Can I get the function's size somehow in my original C program, using C or inline asm?


Solution

  • The size of a function is stored in the ELF symbol table entry for the corresponding symbol (name). C example code showing how to read it programmatically is at the bottom of the Solaris manpage for gelf_getsym(3ELF) (libelf exists on Linux, *BSD and MacOS as well; look for the st_size field of the GElf_Sym structure), but you can also use objdump / elfdump (Solaris) / readelf (Linux) for the task:

    $ objdump -h -d --section=.text foo3.o
    
    foo3.o:     file format elf64-x86-64
    
    Sections:
    Idx Name          Size      VMA               LMA               File off  Algn
      0 .text         00000012  0000000000000000  0000000000000000  00000040  2**2
                      CONTENTS, ALLOC, LOAD, READONLY, CODE
    [ ... ]
    Disassembly of section .text:
    
    0000000000000000 <foo>:
       0:   55                      push   %rbp
       1:   48 89 e5                mov    %rsp,%rbp
       4:   89 7d fc                mov    %edi,0xfffffffffffffffc(%rbp)
       7:   89 75 f8                mov    %esi,0xfffffffffffffff8(%rbp)
       a:   8b 45 f8                mov    0xfffffffffffffff8(%rbp),%eax
       d:   03 45 fc                add    0xfffffffffffffffc(%rbp),%eax
      10:   c9                      leaveq
      11:   c3                      retq

    This is for an unoptimized compile of your code, while the optimized version is:

    $ objdump -h -d --section=.text foo3.o
    
    foo3.o:     file format elf64-x86-64
    
    Sections:
    Idx Name          Size      VMA               LMA               File off  Algn
      0 .text         00000004  0000000000000000  0000000000000000  00000040  2**4
                      CONTENTS, ALLOC, LOAD, READONLY, CODE
    [ ... ]
    Disassembly of section .text:
    
    0000000000000000 <foo>:
       0:   8d 04 37                lea    (%rdi,%rsi,1),%eax
       3:   c3                      retq

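    Note how the "Size" changes from 0x12 to 4. That column is the size of the .text section; because foo is the only function in it, it matches the symbol size recorded by the .size assembler directive.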
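To read the st_size field from inside a running program without linking against libelf, the symbol table can be walked by hand. What follows is only a minimal sketch under several assumptions: Linux (for <elf.h> and /proc/self/exe), a 64-bit ELF file, a binary that has not been stripped of .symtab, and a well-formed file (offsets are not bounds-checked). The helper name elf_symbol_size is hypothetical and not part of any library.

```c
#include <elf.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper (not a library function): returns the st_size of the
 * function symbol `name` in the 64-bit ELF file at `path`, or -1 if the file
 * cannot be read, is not 64-bit ELF, or has no such symbol in .symtab.
 * Assumes a well-formed file; header offsets are trusted, not validated. */
long elf_symbol_size(const char *path, const char *name)
{
    long result = -1;
    struct stat st;

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (fstat(fd, &st) < 0 || st.st_size < (off_t)sizeof(Elf64_Ehdr)) {
        close(fd);
        return -1;
    }

    /* Map the whole file read-only; the mapping survives close(). */
    unsigned char *base = mmap(NULL, (size_t)st.st_size, PROT_READ,
                               MAP_PRIVATE, fd, 0);
    close(fd);
    if (base == MAP_FAILED)
        return -1;

    const Elf64_Ehdr *eh = (const Elf64_Ehdr *)base;
    if (memcmp(eh->e_ident, ELFMAG, SELFMAG) == 0 &&
        eh->e_ident[EI_CLASS] == ELFCLASS64) {
        const Elf64_Shdr *sh = (const Elf64_Shdr *)(base + eh->e_shoff);

        /* Look for the static symbol table section(s). */
        for (int i = 0; i < eh->e_shnum && result < 0; i++) {
            if (sh[i].sh_type != SHT_SYMTAB)
                continue;

            const Elf64_Sym *syms =
                (const Elf64_Sym *)(base + sh[i].sh_offset);
            /* sh_link names the string table holding the symbol names. */
            const char *strtab =
                (const char *)(base + sh[sh[i].sh_link].sh_offset);
            size_t count = sh[i].sh_size / sizeof(Elf64_Sym);

            for (size_t j = 0; j < count; j++) {
                if (ELF64_ST_TYPE(syms[j].st_info) == STT_FUNC &&
                    strcmp(strtab + syms[j].st_name, name) == 0) {
                    result = (long)syms[j].st_size;
                    break;
                }
            }
        }
    }

    munmap(base, (size_t)st.st_size);
    return result;
}
```

Called as elf_symbol_size("/proc/self/exe", "foo") from a program that defines foo, this should report the same number that readelf -s shows in its Size column for that symbol. The value depends on the compiler and optimization level, as the two objdump listings above illustrate.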

    The "trick" of using inline assembly to obtain function sizes / code locations doesn't account for compiler-generated glue code (function entry prologues / exit epilogues, inline code generation, ...), nor for the compiler reordering inline assembly (gcc is notorious for doing so), so it's generally not a great idea to trust it. In the end, it depends on what exactly you're trying to do ...

    Edit: A few more references, external as well as on stackoverflow:

    1. From the gcc mailing list, thread on sizeof(function)
    2. what does sizeof (function name) return?
    3. Find size of a function in C
    4. LibELF by example sourceforge project (this is documentation / a tutorial)