Search code examples
clinuxcompiler-constructiondata-segment

Why is the .bss segment required?


What I know is that global and static variables are stored in the .data segment, and uninitialized data are in the .bss segment. What I don't understand is why do we have dedicated segment for uninitialized variables? If an uninitialized variable has a value assigned at run time, does the variable exist still in the .bss segment only?

In the following program, a is in the .data segment, and b is in the .bss segment; is that correct? Kindly correct me if my understanding is wrong.

#include <stdio.h>
#include <stdlib.h>

int a[10] = { 1, 2, 3, 4, 5, 6, 7, 8, 9};
int b[20]; /* Uninitialized, so in the .bss and will not occupy space for 20 * sizeof (int) */

int main ()
{
   ;
}  

Also, consider following program,

#include <stdio.h>
#include <stdlib.h>
int var[10];  /* Uninitialized so in .bss */
int main ()
{
   var[0] = 20  /* **Initialized, where this 'var' will be ?** */
}

  

Solution

  • The reason is to reduce program size. Imagine that your C program runs on an embedded system, where the code and all constants are saved in true ROM (flash memory). In such systems, an initial "copy-down" must be executed to set all static storage duration objects, before main() is called. It will typically go like this pseudo:

    for(i=0; i<all_explicitly_initialized_objects; i++)
    {
      .data[i] = init_value[i];
    }
    
    memset(.bss, 
           0, 
           all_implicitly_initialized_objects);
    

    Where .data and .bss are stored in RAM, but init_value is stored in ROM. If it had been one segment, then the ROM had to be filled up with a lot of zeroes, increasing ROM size significantly.

    RAM-based executables work similarly, though of course they have no true ROM.

    Also, memset is likely some very efficient inline assembler, meaning that the startup copy-down can be executed faster.