
What are R_386_32 type symbols in the C runtime library?


I read Here.
As far as I understand, R_386_32 is for static data and R_386_PC32 is for functions. Is that right?

However, I'm still confused about the usage of R_386_32 type symbols.
See the examples below.

Example 1

readelf -a --wide /usr/lib/i386-linux-gnu/crt1.o | grep R_386_32
0000000c  00000901 R_386_32               00000000   __libc_csu_fini
00000011  00000b01 R_386_32               00000000   __libc_csu_init
00000018  00000c01 R_386_32               00000000   main

Example 2

readelf -a --wide /usr/local/lib/gcc/i686-pc-linux-gnu/5.5.0/crtbegin.o 
00000001  00001501 R_386_32               00000000   __TMC_END__
00000010  00001601 R_386_32               00000000   _ITM_deregisterTMCloneTable
00000031  00001501 R_386_32               00000000   __TMC_END__
00000049  00001701 R_386_32               00000000   _ITM_registerTMCloneTable
000000a1  00001901 R_386_32               00000000   _Jv_RegisterClasses


Question

  1. In Example 2, is the R_386_32 typed data automatically added to the application
    at compile time?
  2. If yes, can I reference that data in my code?
    For example, can I write an application that prints the value of _Jv_RegisterClasses with printf?
  3. In Example 1, why is main of type R_386_32?
    I think it should be R_386_PC32, because it is a function, not static data.

Solution

  • You are looking at relocations, not symbols. A relocation is what the assembler generates when it refers to a symbol whose value is not yet known; it is an instruction to the linker to fill in the correct value at link time. Relocations are not symbol types: each symbol can be referred to through any number of relocations of any type. Note also that the symbol table does not know what type of datum a symbol refers to, if any. A symbol is just an address and a name.
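
    To connect this to the readelf output in the question: the second column of each line is the relocation's info word, which packs a symbol table index and a relocation type into 32 bits. The sketch below decodes it. The helper names are made up for illustration; the arithmetic is the same as the ELF32_R_SYM and ELF32_R_TYPE macros in <elf.h>, and the type numbers 1 and 2 are the i386 psABI values for R_386_32 and R_386_PC32.

```c
#include <stdint.h>
#include <stdio.h>

/* Relocation type numbers from the i386 ELF psABI. */
enum { RELOC_386_32 = 1, RELOC_386_PC32 = 2 };

/* Upper 24 bits of the info word: index into the symbol table. */
static uint32_t reloc_sym_index(uint32_t info) { return info >> 8; }

/* Lower 8 bits of the info word: relocation type. */
static uint32_t reloc_type(uint32_t info) { return info & 0xffu; }

/* Hypothetical helper: print one decoded relocation entry. */
static void describe_reloc(uint32_t offset, uint32_t info)
{
    printf("offset 0x%08x: symbol #%u, relocation type %u\n",
           (unsigned)offset,
           (unsigned)reloc_sym_index(info),
           (unsigned)reloc_type(info));
}
```

    Calling describe_reloc(0x18, 0x00000c01) for the main entry of Example 1 reports symbol index 12 and type 1. The type belongs to the relocation entry; the symbol it points at is only identified by its index.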

    The relocation type R_386_32 just means “paste the value of the symbol as a 32-bit value here.” There is no way to tell whether the symbol used is for data or text. It is used, for example, when you load the address of a symbol or perform an absolute memory access. Both of these instructions generate an R_386_32 relocation:

    mov $foo, %eax       # move value of symbol to register
    mov foo, %eax        # perform absolute memory access
    

    On the other hand, the relocation type R_386_PC32 subtracts the value of the instruction pointer (program counter) from the symbol and pastes that. This relocation type is mainly used for direct jump and call instructions:

    jmp foo              # jump to foo
    call foo             # call foo
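
    As a rough sketch of what the linker computes in each case: the i386 psABI defines R_386_32 as S + A and R_386_PC32 as S + A - P, where S is the symbol's value, A is the addend already stored at the patched location, and P is the address of the patched location. The function names and all addresses below are invented for illustration.

```c
#include <stdint.h>

/* R_386_32: absolute 32-bit value, S + A. */
static uint32_t resolve_abs32(uint32_t sym, uint32_t addend)
{
    return sym + addend;
}

/* R_386_PC32: value relative to the patched location, S + A - P.
 * Unsigned arithmetic wraps modulo 2^32, which is what is wanted here. */
static uint32_t resolve_pc32(uint32_t sym, uint32_t addend, uint32_t place)
{
    return sym + addend - place;
}

/* Worked example with made-up addresses: a `call foo` whose opcode byte
 * sits at 0x08048100, so the 4-byte operand to patch is at 0x08048101.
 * The assembler stores an addend of -4 there because the CPU measures
 * the displacement from the end of the 5-byte instruction. */
static uint32_t example_call_displacement(void)
{
    const uint32_t foo    = 0x08048200u;  /* hypothetical address of foo */
    const uint32_t place  = 0x08048101u;  /* address of the call operand */
    const uint32_t addend = 0xfffffffcu;  /* -4 as a 32-bit value */
    return resolve_pc32(foo, addend, place);
}
```

    With these numbers the patched displacement is 0xfb, which is the distance from the instruction following the call (0x08048105) to foo. For the same symbol, R_386_32 would simply store 0x08048200.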
    

    In general, there is no way to guess what section a symbol is defined in from looking at relocations. Indeed, the relocations do not give any information about this at all and an object file cannot demand that an external symbol refers to data or text. For defined symbols, you can find out what section they are in by running the nm utility. Symbols marked t or T are text, d or D are data, r or R are read-only data, and b or B are BSS.
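
    This also covers the third question: a function gets an R_386_32 relocation whenever its address is stored or pushed as a value rather than called directly, which is what the startup code in crt1.o does when it passes main on to the C library. A small sketch you can try yourself follows; the file and symbol names are invented. Assuming a 32-bit capable toolchain, compiling it with gcc -m32 -c and running readelf -r on the object should list R_386_32 entries against both helper (text) and counter (data).

```c
/* demo.c: hypothetical file name, for illustration only. */

int counter = 42;                 /* lives in .data */

int helper(void)                  /* lives in .text */
{
    return 7;
}

/* Both initializers need the final address of another symbol, so the
 * assembler leaves an absolute 32-bit relocation for each one. The
 * relocation type is the same although one target is code and the
 * other is data. */
int (*fn_ptr)(void) = helper;
int *data_ptr = &counter;
```

    Running nm on the same object would then show helper as T and counter as D, which is information the relocation entries themselves do not carry.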

    For your second question: yes, you can. Use C code like this to print the value of _Jv_RegisterClasses. Note that the value of a symbol is the address of the variable it refers to.

    extern const void _Jv_RegisterClasses;  /* or any other type */
    
    printf("%p\n", &_Jv_RegisterClasses);   /* print value of symbol */