Pardon me if this question is too trivial! It is known that the final executable does not allocate space for uninitialized data within the image. But I want to know: how do references to symbols within .bss get resolved?
Does the object file contain only the addresses of these .bss variables somewhere else and NOT allocate space for them? If so, where are these resolved addresses stored?
For example, if a C module has global variables like the following:
int x[10]; char chArray[100];
The space for the above variables may not be present in the image, but how does one reference them? Where are their addresses resolved?
Thanks in advance! /MS
.bss symbols get resolved just like any other symbol the compiler (or assembler) generates. Usually this works by placing related symbols in "sections". For example, a compiler might place program code in a section called ".text" (for historical reasons ;-), initialized data in a section called ".data", and uninitialized data in a section called ".bss".
For example:
int i = 4;
int x[10];
char chArray[100];
int main(int argc, char**argv)
{
}
produces (with gcc -S):
.file "test.c"
.globl i
.data
.align 4
.type i, @object
.size i, 4
i:
.long 4
.text
.globl main
.type main, @function
main:
leal 4(%esp), %ecx
andl $-16, %esp
pushl -4(%ecx)
pushl %ebp
movl %esp, %ebp
pushl %ecx
subl $4, %esp
addl $4, %esp
popl %ecx
popl %ebp
leal -4(%ecx), %esp
ret
.size main, .-main
.comm x,40,32
.comm chArray,100,32
.ident "GCC: (GNU) 4.3.2 20081105 (Red Hat 4.3.2-7)"
.section .note.GNU-stack,"",@progbits
The .data directive tells the assembler to place i in the .data section, and the ".long 4" gives it its initial value. When the file is assembled, i will be defined at offset 0 in the .data section.
The .text directive will place main in the .text section, again with an offset of zero.
The interesting thing about this example is that x and chArray are defined using the .comm directive, not placed in .bss directly. Both are given only a size and an alignment (the two numbers after the name), not an offset (yet).
When the linker gets the object files, it links them together by combining all the sections with the same name and adjusting the symbol offsets accordingly. It also gives each section an absolute address at which it should be loaded.
The symbols defined by the .comm directive are combined (if multiple definitions with the same name exist) and placed in the .bss section. It's at this point that they are given their address.