c gcc assembly clang

Remove stack frame setup/initialization


I have the following program:

void main1() {
    ((void(*)(void)) (0xabcdefabcdef)) ();
}

I build it with the following commands:

clang -fno-stack-protector -c -static -nostdlib -fpic  -fpie -O0 -fno-asynchronous-unwind-tables main.c -o shellcode.o
ld shellcode.o -o shellcode -S -static -dylib -e main1 -order_file order.txt
gobjcopy -O binary --only-section=.text shellcode shellcode.output

The assembly looks like the following:

                             //
                             // ram 
                             // ram: 00000000-00000011
                             //
                             **************************************************************
                             *                          FUNCTION                          *
                             **************************************************************
                             undefined FUN_00000000()
             undefined         AL:1           <RETURN>
                             FUN_00000000
        00000000 55              PUSH       RBP
        00000001 48 89 e5        MOV        RBP,RSP
        00000004 48 b8 ef        MOV        RAX,0xabcdefabcdef
                 cd ab ef 
                 cd ab 00 00
        0000000e ff d0           CALL       RAX
        00000010 5d              POP        RBP
        00000011 c3              RET

How do I get clang to remove the PUSH RBP, MOV RBP,RSP and POP RBP instructions as they are unnecessary?

I can do this if I write the program in assembly with the following lines:

.globl start
start:
    movq $0xabcdefabcdef, %rax
    call *%rax
    ret

and with the following build commands:

clang  -static -nostdlib main.S -o crashme.o
gobjcopy -O binary --only-section=.text crashme.o crashme.output

and the resulting assembly:

                             //
                             // ram 
                             // ram: 00000000-0000000c
                             //
             assume DF = 0x0  (Default)
        00000000 48 b8 ef        MOV        RAX,0xabcdefabcdef
                 cd ab ef 
                 cd ab 00 00
        0000000a ff d0           CALL       RAX
        0000000c c3              RET

but I would much rather write C code instead of assembly.


Solution

  • You built with -O0, so optimization is disabled. Any optimization level, such as -O3, enables -fomit-frame-pointer.

    Optimization will also turn the tail call into a jmp instead of call/ret, though. If you need to avoid that for some reason, maybe you can use -fomit-frame-pointer at the default -O0.

    For shellcode you might want -Os to optimize for code size, or even clang's -Oz. The latter has the side effect of avoiding some 0 bytes in the machine code, because it uses push imm8 / pop reg to put small constants in registers instead of mov reg, imm32.
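
    To make the zero-byte point concrete, here is a small sketch in C that compares the two encodings for loading the constant 1 into RAX. The array names and the count_zero_bytes helper are made up for illustration. The bytes are the standard x86-64 encodings of mov eax, imm32 (B8 id) and push imm8 / pop rax (6A ib, 58); whether clang actually picks the second form for a given constant depends on the compiler version and flags.

```c
#include <stddef.h>

/* mov eax, 1 (B8 id): the 32-bit immediate carries three zero bytes. */
const unsigned char mov_eax_1[] = { 0xb8, 0x01, 0x00, 0x00, 0x00 };

/* push 1 ; pop rax (6A ib ; 58): also leaves 1 in RAX, with no zero bytes. */
const unsigned char push_pop_rax_1[] = { 0x6a, 0x01, 0x58 };

/* Hypothetical helper: count the 0x00 bytes in an instruction encoding. */
size_t count_zero_bytes(const unsigned char *code, size_t len) {
    size_t zeros = 0;
    for (size_t i = 0; i < len; i++) {
        if (code[i] == 0x00) {
            zeros++;
        }
    }
    return zeros;
}
```

    The mov form is 5 bytes with three zero bytes, while the push/pop form is 3 bytes with none. This only helps for small constants: the 10-byte MOV RAX,0xabcdefabcdef in the question's listing still ends in 00 00, because the upper two bytes of that 64-bit constant are zero.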