How to tell the GNU assembler to warn when using undefined labels?


Is there any way to tell the GNU assembler to warn about undefined labels at assembly time?

Let's suppose I made a typo:

jmp MyLabell

MyLabel:

I don't use a linker. I use as to generate an object file, and then I use objcopy to convert the .o file to a raw binary. I do this because I intend to run it in an environment that can only execute raw binaries (a real mode bootloader). I thought the assembler would warn me if I used an undefined label when assembling the source code.

It assembles just fine, but since MyLabell was never defined, it always translates to address 0, leaving the programmer clueless and oblivious. Can as be told not to ignore such problems? If not, is there a reason it's not possible? As far as I can remember (correct me if I'm wrong), NASM does care that I use only defined labels.

Version of the GNU assembler I am using: GNU assembler (GNU Binutils) 2.29.1

I've already spent a whole night debugging my code just to find out that after renaming a label, I did not change all references to it.


Solution

  • This is in essence an XY problem.

    To answer the original question: the GNU assembler (as) assumes that any label it can't find in the current file is defined in another file and will be resolved by a linker at link time. It places a dummy value as the jump target, which the linker is expected to fix up.
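
    One way to see which symbols the assembler has left for a linker, without linking at all, is to list the undefined symbols in the object file with nm (part of binutils). The sketch below is only an illustration: check_undefined is a hypothetical helper name, and boot.o is assumed to be the object file produced by as.

```shell
# Hypothetical helper (not part of binutils): fail if an object file
# still contains undefined symbols, such as a mistyped label.
check_undefined() {
    # nm -u prints only the undefined symbols, one per line.
    undefined=$(nm -u "$1") || return 2
    if [ -n "$undefined" ]; then
        printf 'undefined symbols in %s:\n%s\n' "$1" "$undefined" >&2
        return 1
    fi
    return 0
}
```

    Running check_undefined boot.o after assembling the typo from the question should return non-zero and list MyLabell. This is only a quick check; it does not replace the link step.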


    The issue comes down to the fact that you never found out if the label was undefined because you didn't run it through a linker to generate an executable. Converting a raw object file to a flat binary may not work as expected. To fix this:

    • Use the GNU assembler to generate one or more object files.
    • Use the linker LD to resolve the symbols and set the origin point.
    • Use OBJCOPY to convert the final linked executable to binary.

    If you are creating a real mode bootloader, do not use the .org directive in the GNU assembler. It doesn't do what you expect it to. It is different from the org directive that NASM uses when generating raw binary files directly. You could use something like:

    as --32 boot.s -o boot.o
    ld -melf_i386 -nostdlib -Ttext=0x7c00 boot.o -o boot.elf
    objcopy -O binary boot.elf boot.bin
    

    With the LD command in the example you can specify any number of .o files to link together or just one object file if you wish.


    As an addendum to the last section, I prefer to use a linker script with LD. For bootloaders I use the linker script to place the boot signature in the appropriate place (so it can be removed from your assembly file) and to set the origin point to 0x7c00. This is a very simple one that assumes your bootloader only uses .text, .data and .rodata sections:

    File link.ld:

    OUTPUT_FORMAT("elf32-i386");
    ENTRY(start);
    SECTIONS
    {
        . = 0x7C00;
        .text : {
            *(.text);
        }
        .data : {
            *(.data);
            *(.rodata);
        }
    
        /* Boot signature */
        .sig : AT(0x7DFE) {
            SHORT(0xaa55);
        }
    
        /* Discard common unwanted/unneeded sections */
        /DISCARD/ : {
            *(.comment);
            *(.note.gnu.build-id);
        }
    }
    

    Then assemble and link the file. In this case we specify -Tlink.ld to use the linker script above, and we no longer have to use -Ttext=0x7c00:

    as --32 boot.s -o boot.o
    ld -melf_i386 -nostdlib -Tlink.ld boot.o -o boot.elf
    objcopy -O binary boot.elf boot.bin
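
    As a sanity check on the result, the sketch below tests that the flat binary is exactly 512 bytes and ends with the signature bytes 0x55 0xAA. It assumes the linker script above was used and that the code and data fit in the first 510 bytes. verify_boot_sector is a hypothetical helper name, not part of binutils.

```shell
# Hypothetical helper: check that a flat boot sector image is exactly
# 512 bytes long and ends with the boot signature bytes 0x55 0xAA.
verify_boot_sector() {
    size=$(wc -c < "$1" | tr -d '[:space:]')
    if [ "$size" -ne 512 ]; then
        echo "$1: expected 512 bytes, got $size" >&2
        return 1
    fi
    # Dump the last two bytes as hex; SHORT(0xaa55) is stored little-endian,
    # so the file ends with 55 aa.
    sig=$(tail -c 2 "$1" | od -An -tx1 | tr -d '[:space:]')
    if [ "$sig" != "55aa" ]; then
        echo "$1: bad boot signature: $sig" >&2
        return 1
    fi
    return 0
}
```

    It would be called as verify_boot_sector boot.bin after the objcopy step.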
    

    Properly using a linker to generate executables should produce an error similar to this if a label like MyLabell was not found:

    boot.o:(.text+0x1): undefined reference to `MyLabell'
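
    To make sure such an error is not overlooked, the three steps can be chained so that a failed link stops the build. This is a minimal sketch, assuming the file names boot.s and link.ld used above; build_boot is a hypothetical function name.

```shell
# Hypothetical wrapper around the three commands shown above. Each step
# runs only if the previous one succeeded, so a link error such as an
# undefined reference stops the build before objcopy runs.
build_boot() {
    as --32 boot.s -o boot.o &&
    ld -melf_i386 -nostdlib -Tlink.ld boot.o -o boot.elf &&
    objcopy -O binary boot.elf boot.bin
}
```

    With the typo from the question, build_boot should return a non-zero status after LD reports the undefined reference, and no new boot.bin is written.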


    Origin points and Real Mode code

    The GNU Linker has no understanding of real mode 20-bit segment:offset addressing. A bootloader will be loaded at physical address 0x07c00, but there is more than one way to address that location. In real mode segment:offset addressing the segment and the offset combine to define the physical address. The calculation is segment * 16 + offset. The origin point you choose in the linker script or with the -Ttext= option must combine with the segment you load into the segment registers (especially DS) to be 0x07c00. If you set the segments to 0x0000 then the offset you need is 0x7c00 because 0x0000 * 16 + 0x7c00 = 0x07c00. Using a segment of 0x7c0 you'd need an offset of 0x0000 as 0x7c0 * 16 + 0x0000 = 0x07c00.
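
    The calculation can be checked with shell arithmetic. This is only an illustration of the formula; phys_addr is a hypothetical helper name.

```shell
# Real mode physical address: segment * 16 + offset.
# phys_addr is a hypothetical helper name used only for this illustration.
phys_addr() {
    printf '0x%05x\n' $(( $1 * 16 + $2 ))
}

phys_addr 0x0000 0x7c00   # segment 0x0000, origin 0x7c00
phys_addr 0x07c0 0x0000   # segment 0x07c0, origin 0x0000
```

    Both calls print 0x07c00, which is why either segment and origin pair addresses the bootloader correctly.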

    My linker script link.ld assumed you loaded the DS segment register with 0x0000. The value you used for the segments was 0x7c0, so you need to change link.ld to use . = 0x0000; instead of . = 0x7C00;.

    If you use . = 0x0000 as the origin point then you also need to adjust the boot signature location by subtracting 0x7c00 from it. The line .sig : AT(0x7DFE) { would have to be changed to .sig : AT(0x1FE) {. If you don't use a linker script and instead specify the origin point when running LD, then it would have to be changed to -Ttext=0x0000.
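
    The two AT() values follow from the same rule: the signature sits in the last two bytes of the 512-byte sector, at the origin point plus 0x1FE. A small sketch of that arithmetic, with sig_load_addr as a hypothetical helper name:

```shell
# The boot signature occupies the last two bytes of the 512-byte sector,
# so its load address is the origin point plus 0x1FE (510).
# sig_load_addr is a hypothetical helper name for this illustration.
sig_load_addr() {
    printf '0x%X\n' $(( $1 + 0x1FE ))
}

sig_load_addr 0x7C00   # origin used with segment 0x0000
sig_load_addr 0x0000   # origin used with segment 0x07c0
```

    The first call prints 0x7DFE and the second prints 0x1FE, matching the two AT() values.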