c, memory-management, elf, memory-address, objdump

How can I access interpreter path address at runtime in C?


Using the objdump command, I figured out that the address 0x02a8 contains the start of the path /lib64/ld-linux-x86-64.so.2, and that this path ends with a 0x00 byte, as C strings do.

So I tried to write a simple C program that prints this line (I used a sample from the book "RE for beginners" by Dennis Yurichev, page 24):

#include <stdio.h>

int main(){
    printf(0x02a8);
    return 0;
}

But I was disappointed to get a segmentation fault instead of the expected /lib64/ld-linux-x86-64.so.2 output.

I find it strange to call printf in such a "fast" way, without format specifiers or at least a pointer cast, so I tried to make the code more natural:

#include <stdio.h>

int main(){
    char *p = (char*)0x02a8;
    printf(p);
    printf("\n");
    return 0;
}

And after running this I still got a segmentation fault.

I don't believe this is happening because of restricted memory areas, because in the book it all works on the first try. I am not sure; maybe there is something more that wasn't mentioned in that book.

So I need a clear explanation of why the segmentation fault happens every time I run the program.

I'm using the latest fully-upgraded Kali Linux release.


Solution

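  • It no longer works like that. The 64-bit Linux executables that you're likely using are position-independent, and they're loaded into memory at an arbitrary address. In that case the ELF file does not contain any fixed base address, and 0x02a8 is only an offset from wherever the file ends up mapped, so dereferencing it as an absolute address hits unmapped memory.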

    While you could make a position-dependent executable, as instructed by Marco Bonelli, that is not how arbitrary executables work on modern 64-bit Linux systems. It is more worthwhile to learn to do this with position-independent ones, although it is a bit trickier.
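
    One way to check which case applies is to look at the e_type field of the executable's own ELF header. The sketch below is not part of the original answer: the helper name own_elf_type is made up for illustration, and it assumes a Linux system where /proc/self/exe is readable and <elf.h> is available. A value of ET_DYN suggests a position-independent executable, ET_EXEC a fixed-address one.

    ```c
    #include <elf.h>
    #include <stdio.h>
    #include <string.h>

    // Hypothetical helper: returns the e_type field of the running
    // executable's ELF header, or -1 if the header cannot be read.
    int own_elf_type(void) {
        unsigned char buf[sizeof(Elf64_Ehdr)];

        // /proc/self/exe refers to the file the current process was started from
        FILE *f = fopen("/proc/self/exe", "rb");
        if (!f) return -1;
        size_t n = fread(buf, 1, sizeof buf, f);
        fclose(f);

        // the file must be long enough and start with the bytes 0x7f 'E' 'L' 'F'
        if (n != sizeof buf || memcmp(buf, ELFMAG, SELFMAG) != 0) return -1;

        // copy into a properly aligned struct before reading the field
        Elf64_Ehdr eh;
        memcpy(&eh, buf, sizeof eh);
        return eh.e_type;
    }
    ```

    Printing own_elf_type() == ET_DYN from main should show 1 for a default build on many current distributions and 0 when the program is linked with -no-pie, though compiler defaults vary.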

    This worked for me to print the ELF header magic (the byte 0x7f followed by ELF) and the interpreter string. It is a dirty hack that probably only works for a small executable, because it assumes that main lies in the page immediately after the ELF headers.

    #include <stdio.h>
    #include <stdlib.h>
    #include <inttypes.h>
    
    int main(){
        // convert main to uintptr_t
        uintptr_t main_addr = (uintptr_t)main;
    
        // clear bottom 12 bits so that it points to the beginning of page
        main_addr &= ~0xFFFLLU;
    
        // subtract one page so that we're in the elf headers...
        main_addr -= 0x1000;
    
        // elf magic
        puts((char *)main_addr);
    
        // interpreter string, offset from hexdump!
        puts((char *)main_addr + 0x318);
    }
    

    There is another trick to find the beginning of the ELF executable in memory: the so-called auxiliary vector and getauxval:

    The getauxval() function retrieves values from the auxiliary vector, a mechanism that the kernel's ELF binary loader uses to pass certain information to user space when a program is executed.

    The location of the ELF program headers in memory will be

    #include <sys/auxv.h>
    char *program_headers = (char*)getauxval(AT_PHDR);
    

    In a 64-bit ELF file the ELF header is 64 bytes long and the program headers normally start right after it, at byte 64. Subtracting 64 from this pointer therefore gives a pointer to the magic string again, so our code can be simplified to

    #include <stdio.h>
    #include <inttypes.h>
    #include <sys/auxv.h>
    
    
    int main(){
        char *elf_header = (char *)getauxval(AT_PHDR) - 0x40;
        puts(elf_header + 0x318); // or whatever the offset was in your executable
    }
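
    The fixed 0x40 works because of how typical 64-bit executables are laid out. As a hedged sketch, the same lookup can be written with the Elf64_Ehdr type from <elf.h> plus a sanity check on the magic bytes and on e_phoff. The helper name find_elf_header is hypothetical, and the code still assumes that the ELF header is mapped directly in front of the program headers.

    ```c
    #include <elf.h>
    #include <string.h>
    #include <sys/auxv.h>

    // Hypothetical helper: returns a pointer to the in-memory ELF header of
    // the running program, or NULL if the layout is not the expected one.
    const Elf64_Ehdr *find_elf_header(void) {
        unsigned long phdr = getauxval(AT_PHDR);
        if (phdr == 0) return NULL;

        // assumption: the header sits immediately before the program headers
        const char *candidate = (const char *)phdr - sizeof(Elf64_Ehdr);

        // check for the bytes 0x7f 'E' 'L' 'F'
        if (memcmp(candidate, ELFMAG, SELFMAG) != 0) return NULL;

        // e_phoff must agree with the assumption made above
        const Elf64_Ehdr *eh = (const Elf64_Ehdr *)candidate;
        if (eh->e_phoff != sizeof(Elf64_Ehdr)) return NULL;
        return eh;
    }
    ```

    If the function returns a non-NULL pointer, adding the interpreter offset to it (after a cast to char *) gives the same string as the version with the hard-coded 0x40.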
    

    And finally, an executable that figures out the interpreter position from the ELF headers alone, provided that it is a 64-bit ELF (the magic numbers are taken from Wikipedia):

    #include <stdio.h>
    #include <inttypes.h>
    #include <sys/auxv.h>
    
    
    int main() {
        // get pointer to the first program header
        char *ph = (char *)getauxval(AT_PHDR);
    
        // elf header at this position
        char *elfh = ph - 0x40;
    
        // segment type 0x3 (PT_INTERP) describes the interpreter path;
        // each program header entry is 0x38 bytes long in 64-bit executables
        while (*(uint32_t *)ph != 3) ph += 0x38;
    
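        // the segment's file offset is the 64-bit field at 0x8 from the
        // beginning of the program header entry; it is used below as an
        // offset in memory, which assumes the start of the file is mapped
        // at the ELF header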
        uint64_t offset = *(uint64_t *)(ph + 0x8);
    
        // print the interpreter path...
        puts(elfh + offset);
    }
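
    The loop above stops only when it finds a type-3 entry, and it treats the file offset as a memory offset. A more defensive variant, sketched here under the assumption of a 64-bit Linux process, bounds the loop with AT_PHNUM and AT_PHENT and uses the virtual addresses from the program headers. The function name interp_path is invented for this example; the struct and constant names come from <elf.h>.

    ```c
    #include <elf.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <sys/auxv.h>

    // Hypothetical helper: returns the in-memory interpreter path of the
    // running program, or NULL if it cannot be determined.
    const char *interp_path(void) {
        const char *ph = (const char *)getauxval(AT_PHDR);
        size_t phnum = getauxval(AT_PHNUM);   // number of program headers
        size_t phent = getauxval(AT_PHENT);   // size of one program header
        if (!ph || phent < sizeof(Elf64_Phdr)) return NULL;

        uintptr_t bias = 0;                   // actual address minus p_vaddr
        int have_bias = 0;
        const Elf64_Phdr *interp = NULL;

        for (size_t i = 0; i < phnum; i++) {
            const Elf64_Phdr *p = (const Elf64_Phdr *)(ph + i * phent);
            if (p->p_type == PT_PHDR) {
                // PT_PHDR records where the program headers were linked to
                // live; the difference to AT_PHDR is the load bias
                bias = (uintptr_t)ph - p->p_vaddr;
                have_bias = 1;
            } else if (p->p_type == PT_INTERP) {
                interp = p;
            }
        }

        // statically linked programs have no PT_INTERP entry
        if (!interp || !have_bias) return NULL;
        return (const char *)(bias + interp->p_vaddr);
    }
    ```

    Calling puts(interp_path()) from main would print the interpreter path for a dynamically linked program. A statically linked one has no PT_INTERP entry, so the function returns NULL and the result needs a check first.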