c function linker function-pointers memory-segmentation

In which memory segment does the address of a function lie in C?


A function pointer holds the address of a function in C. That means a function has a memory address, and that address must belong to one of the process's memory segments. I printed the address of a function and it points into the code segment. Do all functions written in C have their addresses in the code segment?

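#include <stdio.h>
#include <unistd.h> /* sleep() */

void func(void)
{
        printf("hi!!!\n");
}

int main(void)
{
        void (*fptr)(void);
        fptr = func;
        /* %p expects void *; casting a function pointer to void * is a
         * common extension (POSIX relies on it), not strict ISO C. */
        printf("fptr pointing func addr : %p and &fptr : %p\n", (void *)fptr, (void *)&fptr);
        fptr();
        while (1) {
                sleep(1); /* Keep the process alive so its procfs entries can be inspected */
        }
        return 0;
}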

Program Output:

[revarath@bgl-vms-vm0251 basic]$ ./a.out &
[1] 20168
[revarath@bgl-vms-vm0251 basic]$ fptr pointing func addr : 0x4005ad and &fptr : 0x7fff301e6b80
hi!!!
c = 10

Finding PID to access procfs:

[revarath@bgl-vms-vm0251 basic]$ ps -ef | grep a.out
revarath 20168 24339  0 04:55 pts/2    00:00:00 ./a.out
revarath 20180 24339  0 04:56 pts/2    00:00:00 grep --color=auto a.out

Memory map output:

[revarath@bgl-vms-vm0251 basic]$ cat /proc/20168/maps
00400000-00401000 r-xp 00000000 00:2c 124964755                          /ws/revarath-bgl/backup/tests/basic/a.out
00600000-00601000 r--p 00000000 00:2c 124964755                          /ws/revarath-bgl/backup/tests/basic/a.out
00601000-00602000 rw-p 00001000 00:2c 124964755                          /ws/revarath-bgl/backup/tests/basic/a.out
.......
7fff301c8000-7fff301e9000 rw-p 00000000 00:00 0                          [stack]

The func address 0x4005ad lies within the region 00400000-00401000, which is mapped r-xp (readable and executable) from a.out.

CPU info (multi-core Intel):

[revarath@bgl-vms-vm0251 basic]$ lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                8

Compiler:

gcc
gcc version 4.8.5 20150623 (Red Hat 4.8.5-36)

Is it correct to say that the address of a function lies in (or points into) the code segment?


Solution

  • There are more language-lawyerly answers to be had here (for example: don't rely on implementation details of a particular system in your code; see the disclaimer below), but basically, yes, that's how it works.

    All of the compiled code from your program, including the function you have illustrated with, is part of the code segment (also confusingly known as the "text" segment) in the resulting binary image. The OS's loader will splat those compiled code bytes into that memory segment within the process, and if you take the address of a function defined in your program this way, you'll see its corresponding code sitting somewhere in that segment. The same is true of main, etc.


    The way a compiled program gets into memory and gets executed is a lot less magical than it might seem. Setting aside a modern OS's security mechanisms, etc., it kinda goes like this:

    1. The toolchain turns all your C code into a stream of executable bytes. Each compiled function necessarily starts at some particular offset into that stream of bytes.
    2. The "image" (executable file) contains different segments inside it: a code segment, a set of global variables with their initialized values, a segment indicating some initially-zero values that will need to appear on program start but that don't need to be included explicitly in the image, etc.
    3. The loader in the OS pulls apart the image file and thwacks the different parts into memory in the ways it wants to with appropriate executable/read/write flags, patches up the static references, and then transfers control to the first byte of your main function (by way of hidden-from-you C runtime stuff).

    When you get the address of your function, there's no magic there: it's a pointer to the first byte of the compiled instructions for that function.

    Professional Software Engineering Disclaimer: As comments on this answer and the question indicate, in the real wild world there are umpteen variants on how linkers and loaders work; there are systems that don't really do it this way. Certainly you would never write broadly portable C code that made assumptions about memory segments. But for pedagogical purposes, if you're on one of the big-name traditional desktoppy OSs, it *squints* basically works more or less something like you imagine. (If you can find an old copy of Tanenbaum's "Modern Operating Systems", even the old first edition, it will illuminate a lot of seemingly opaque topics in this vein.)