
ld ignores size of nobits input section


When working on a small 32-bit kernel for the x86 architecture I discovered something strange with how ld handles nobits sections.

In my kernel I define a .bootstrap_stack section which holds a temporary stack for the initialisation part of the system. I also define symbols for the beginning and end of the stack. This input section is placed into the .bss output section by the linker script. Each output section of my kernel has a symbol for its beginning and its end.

The problem is that in the final executable the symbol for the end of the stack lies past the end of the .bss section. In the examples below, stack_top and _kernel_ebss (and _kernel_end) all have the same value, so .bss ends where the stack begins, which isn't what I wanted.

I expected _kernel_ebss to equal stack_bottom.

However, once I rename .bootstrap_stack to .bss the problem goes away. Removing nobits also works, but the resulting binary is considerably larger.

Here are the stripped-down files that reproduce my issue:

boot.s

section .bootstrap_stack, nobits ; this does not work
;section .bootstrap_stack        ; this works
;section .bss                    ; this also works

stack_top:
resb 8096
stack_bottom:

section .text
global _start
_start:
    hlt
    jmp _start

linker.ld

ENTRY(_start)

SECTIONS
{
    . = 0xC0100000;

    _kernel_start = .;

    .text ALIGN(4K) : AT(ADDR(.text) - 0xC0000000)
    {
        _kernel_text = .;
        *(.multiboot)
        *(.text)
        _kernel_etext = .;
    }

    .bss ALIGN(4K) : AT(ADDR(.bss) - 0xC0000000)
    {
        _kernel_bss = .;
        *(COMMON)
        *(.bss)
        *(.bootstrap_stack)
        _kernel_ebss = .;
    }
    _kernel_end = .;
}

Here are the symbols:

$ objdump -t kernel | sort
00000000 l    df *ABS*              00000000 boot.s
c0100000 g       .text              00000000 _kernel_start
c0100000 g       .text              00000000 _kernel_text
c0100000 g       .text              00000000 _start
c0100000 l    d  .text              00000000 .text
c0100003 g       .text              00000000 _kernel_etext
c0101000 g       .text              00000000 _kernel_bss
c0101000 g       .text              00000000 _kernel_ebss
c0101000 g       .text              00000000 _kernel_end
c0101000 l       .bootstrap_stack,  00000000 stack_top
c0101000 l    d  .bootstrap_stack,  00000000 .bootstrap_stack,
c0102fa0 l       .bootstrap_stack,  00000000 stack_bottom

By renaming .bootstrap_stack to .bss I get what I expected.

00000000 l    df *ABS*  00000000 boot.s
c0100000 g       .text  00000000 _kernel_start
c0100000 g       .text  00000000 _kernel_text
c0100000 g       .text  00000000 _start
c0100000 l    d  .text  00000000 .text
c0100003 g       .text  00000000 _kernel_etext
c0101000 g       .bss   00000000 _kernel_bss
c0101000 l       .bss   00000000 stack_top
c0101000 l    d  .bss   00000000 .bss
c0102fa0 g       .bss   00000000 _kernel_ebss
c0102fa0 g       .bss   00000000 _kernel_end
c0102fa0 l       .bss   00000000 stack_bottom
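
As a sanity check on the addresses in the two listings, the arithmetic can be written out in a few lines of Python. This is only a sketch: the constant names are made up for illustration, and the values are copied from boot.s and the objdump output above.

```python
# Values copied from boot.s and the symbol tables above.
# The constant names are illustrative only.
STACK_TOP = 0xC0101000    # address of stack_top in both listings
STACK_SIZE = 8096         # bytes reserved by `resb 8096`

# stack_bottom is the label placed right after the reserved bytes.
stack_bottom = STACK_TOP + STACK_SIZE

print(hex(stack_bottom))              # 0xc0102fa0
print(stack_bottom == 0xC0102FA0)     # True: matches _kernel_ebss in the working build
```

So in the working build _kernel_ebss sits exactly 8096 bytes past _kernel_bss, while in the failing build the two symbols coincide and the stack is not counted as part of .bss at all.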

My question is whether this is the expected behaviour of ld. If it is, what is wrong with my example? As far as I understand, .bss is also a nobits section, yet it produces the expected result.


Solution

  • Okay I figured it out.

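    Apparently you're not supposed to have a comma right after the name of the section. The assembler treats the comma as part of the name, so the section is actually called ".bootstrap_stack," and objdump shows it with the trailing comma. That name does not match the *(.bootstrap_stack) pattern in the linker script, so ld does not place it inside .bss and emits it as a separate section instead, which is why _kernel_ebss does not cover the stack. It also explains why dropping ", nobits" entirely worked: the comma went away with it.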

    So

    section .bootstrap_stack, nobits
    

    should be

    section .bootstrap_stack nobits
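
A mistake like this can be caught by comparing the section names in the symbol table against the input-section patterns of the linker script. The following Python sketch does that under a few assumptions: the helper names are hypothetical, the objdump lines are parsed by simple whitespace splitting (which works for the listings in the question but is not a general objdump parser), and fnmatch is used as an approximation of ld's wildcard matching.

```python
import fnmatch

def section_names(symtab_text):
    """Collect section names from the section ('d') symbols of `objdump -t` output."""
    names = set()
    for line in symtab_text.splitlines():
        fields = line.split()
        # Section symbols look like: ADDRESS l d SECTION SIZE NAME
        if len(fields) == 6 and fields[1] == "l" and fields[2] == "d":
            names.add(fields[3])
    return names

def unmatched_sections(names, patterns):
    """Return the section names that no linker-script pattern matches."""
    return sorted(
        name for name in names
        if not any(fnmatch.fnmatchcase(name, pat) for pat in patterns)
    )

# Section symbols copied from the failing build in the question.
SYMTAB = """\
c0100000 l    d  .text              00000000 .text
c0101000 l    d  .bootstrap_stack,  00000000 .bootstrap_stack,
"""

# Input-section patterns copied from linker.ld.
PATTERNS = [".multiboot", ".text", "COMMON", ".bss", ".bootstrap_stack"]

print(unmatched_sections(section_names(SYMTAB), PATTERNS))
# ['.bootstrap_stack,']
```

Any name that shows up in the result is a section the linker script never mentions by pattern, so ld has to place it on its own. Here the only such name is the one with the stray comma.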