Tags: c, assembly, shellcode

How does C store functions and when does it convert to machine code?


I recently asked a question in which I had to create an environment variable MYENV and store something in it so that this code runs successfully:

#include <stdio.h>
#include <stdlib.h>

int main(){
            int (*func)();
            func = getenv("MYENV");
            func();
}

Earlier I was doing something like export MYENV=ls.

A user pointed out that this is incorrect: when func() is called, it tells the machine to execute the bytes that func points to, which here are the characters of the string ls, and those are not valid machine code. So I should pass some shellcode instead.

Now I want to know if this is how it works for functions in general. Say I declare a function myFunction() that multiplies 100 and 99 and returns the value. Does the name myFunction then point to a set of machine instructions, stored somewhere, that multiply 100 and 99 and return the result?

And if I were to figure out those machine instructions, store them in a string, make a function pointer point to that string, and then call it, would 9900 be returned?

This is what I mean :

int (*myFunc)();
char *var = <machine_instructions_in_string_format>;
myFunc = (int (*)())var;   /* point myFunc at the bytes */
int returnVar = myFunc();

Will returnVar be 9900?

And if yes, how do I figure out what that string is?

I am having a hard time wrapping my head around this.


Solution

  • You have to fill the environment variable with machine code (opcodes) for your target machine. I ran a small experiment:

    #include <stdio.h>
    #include <stdlib.h>
    
    int main(void) {
            int (*f)();
            /* getenv returns char *, so cast it explicitly to a function pointer */
            f = (int (*)())getenv("VIRUS");
            (*f)();
            printf("Haha, it returned\n");
            return 0;
    }
    

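    I compiled it, then used execstack to mark the binary as needing an executable stack (the environment strings live on the stack, which is normally not executable):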

    $ cc ge.c
    $ execstack -s ./a.out
    

    Then I wrote a bit of assembler:

    mov %rbp, %rsp
    pop %rbp
    ret
    

    This mimics a function epilogue. I assembled it:

    $ cc -c t.s
    

    Looked at the opcodes:

    $ objdump -D t.o
    ...
       0:   48 89 ec                mov    %rbp,%rsp
       3:   5d                      pop    %rbp
       4:   c3                      retq   
    

    Set the environment variable:

    $ export VIRUS=$(printf "\\x48\\x89\\xec\\x5d\\xc3")
    

    Then ran the program:

    $ ./a.out
    

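    It printed nothing, which shows that the printf line was skipped: the injected code never set up a stack frame of its own, so its epilogue tore down main's frame and returned straight to main's caller. Just to check, I tried a bare ret instead: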

    $ export VIRUS=$(printf "\\xc3")
    $ ./a.out
    Haha, it returned
    

    This was run on Ubuntu 18.04 on amd64. If this happens to be a school assignment, you could aim for bonus points and figure out how to get it to execute an opcode that contains a null (0) byte, given that the value of an environment variable ends at its first 0 byte.