
Concept: How are declarations linked to the appropriate definitions


How exactly does a header file or any forward declarations know which definition it is referring to?

I understand that .cpp files are compiled independently, and we need a header file or forward declaration to access members of another .cpp file. But when we declare a member, we don't explicitly tell the compiler where to get the definition from.

Here is a case that I can think of: Say I have two cpp files 'one.cpp' and 'two.cpp'. Both 'one.cpp' and 'two.cpp' define a function 'int func(int x)' with different implementations (but with exactly the same name and signature). If we have a header file or declaration of this function somewhere outside these two files, how does the compiler know which definition to take?


Solution

  • If we have a header file or declaration of this function somewhere outside these two files, how does the compiler know which definition to take?

    It is the linker, not the compiler, that does this. The linker takes one or more object files or libraries as input and combines them into an executable file. In doing so, it resolves references to external symbols: it looks for the definition of every external function and global variable in the other object files and in external libraries, assigns final addresses to functions and variables, and revises code and data to reflect those addresses.

    Let's consider the example you mention in the question:

    Say I have two cpp files 'one.cpp' and 'two.cpp'. Both 'one.cpp' and 'two.cpp' define a function 'int func(int x)' with different implementations....

    Say, one.cpp:

    int func(int x)
    {
            return x+1;
    }
    

    and two.cpp:

    int func(int x)
    {
            return x+2;
    }
    

    and a header file declaring func() function, say myinc.h:

    int func(int x);
    

    and main() which is calling func(), say main.cpp:

    #include <iostream>
    #include <myinc.h>
    
    int main()
    {
            int res;
            res = func(10);
            std::cout << res << std::endl;
            return 0;
    }
    

    I can create the object file of main.cpp because an object file can refer to symbols that are not defined.

    >g++ -I . -c main.cpp

    Now let's examine the object file main.o using the nm command. The output is:

    Symbols from main.o:
    
    Name                  Value           Class        Type         Size             Line  Section
    
    _GLOBAL__I_main     |0000000000000078|   t  |              FUNC|0000000000000015|     |.text    
    _Z41__static_initialization_and_destruction_0ii|0000000000000038|   t  |              FUNC|0000000000000040|     |.text    
    _Z4funci            |                |   U  |            NOTYPE|                |     |*UND*
    .......
    .......<SNIP>
    

    The Class of func() is U, which means undefined. The compiler does not mind that it cannot find the definition of a function: given a declaration, it assumes the function is defined in another file and leaves the reference unresolved in the object file.

    The linker, on the other hand, looks across all the object files and libraries it is given and tries to find a definition for every symbol that was left undefined.
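
    As a minimal sketch of this split between declaration and definition, the single file below stands in for main.cpp plus one.cpp. The helper call_func is hypothetical and not part of the original example. The call is compiled with only the declaration in view; the definition appears later, just as it could appear in a different translation unit.

    ```cpp
    #include <cassert>
    #include <iostream>

    // Declaration only: gives the compiler the name, parameter types and
    // return type. This is what myinc.h provides in the example above.
    int func(int x);

    // Hypothetical helper: this call compiles against the declaration alone.
    int call_func() { return func(10); }

    // Definition. In the example above this would live in one.cpp and be
    // matched to the call by the linker, not by the compiler.
    int func(int x) { return x + 1; }

    int main() {
        assert(call_func() == 11);
        std::cout << call_func() << std::endl;  // prints 11
        return 0;
    }
    ```

    If the definition is removed, the file still compiles with -c, but linking fails with an undefined reference to func(int).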

    So, when we try creating an executable from the object files one.o, two.o and main.o:

    >g++ two.o one.o main.o -o outexe
    one.o: In function `func(int)':
    one.cpp:(.text+0x0): multiple definition of `func(int)'
    two.o:two.cpp:(.text+0x0): first defined here
    collect2: ld returned 1 exit status
    

    Here you can see the linker reporting a multiple definition error for func(), because it finds two definitions of func().

    There is a One Definition Rule (ODR) in C++, which states that:

    In the entire program, an object or non-inline function cannot have more than one definition; if an object or function is used, it must have exactly one definition. You can declare an object or function that is never used, in which case you don't have to provide a definition. In no event can there be more than one definition.

    So, the program violates the ODR because it contains two definitions of the same function.
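
    If both implementations are genuinely needed in one program, they have to become distinct entities. A minimal sketch, assuming namespaces named one and two (hypothetical names chosen to mirror the file names), is shown below in a single file. Each function now has its own qualified name, so there is exactly one definition of each.

    ```cpp
    #include <cassert>
    #include <iostream>

    // Stand-in for one.cpp: the function is now one::func, not ::func.
    namespace one {
    int func(int x) { return x + 1; }
    }  // namespace one

    // Stand-in for two.cpp: a different entity, two::func.
    namespace two {
    int func(int x) { return x + 2; }
    }  // namespace two

    int main() {
        assert(one::func(10) == 11);
        assert(two::func(10) == 12);
        // The caller states which definition it wants by qualifying the name.
        std::cout << one::func(10) << " " << two::func(10) << std::endl;  // prints 11 12
        return 0;
    }
    ```

    Another option, when a function is only used inside its own .cpp file, is to declare it static or place it in an unnamed namespace. That gives it internal linkage, so it is not visible to other translation units and cannot clash with a function of the same name elsewhere.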

    If we give the linker only one of one.o and two.o, that is, only one definition of func(), it generates the executable:

    >g++ one.o main.o -o outexe
    

    and if we examine outexe, we get:

    Symbols from outexe:
    
    Name                  Value           Class        Type         Size             Line  Section
    
    .....<SNIP>    
    _Z41__static_initialization_and_destruction_0ii|000000000040082c|   t  |              FUNC|0000000000000040|     |.text
    _Z4funci            |00000000004007e4|   T  |              FUNC|000000000000000f|     |.text
    .......
    .......<SNIP>
    

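    The symbol func() now has Type FUNC and Class T, which means the symbol is defined in the text (code) section. The undefined reference from main.o has been resolved to the definition from one.o, which the linker placed at address 0x4007e4.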
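
    The symbol name _Z4funci in the nm output also shows what the linker actually matches on: under the Itanium C++ ABI that g++ uses, the name encodes the function name and its parameter types. The sketch below is an illustration, not part of the original example. It adds a second, hypothetical overload taking double to show that overloads are different symbols, while two definitions with the same name and parameter types would collide.

    ```cpp
    #include <cassert>
    #include <iostream>

    // Mangled as _Z4funci under the Itanium C++ ABI: "4func" is the name,
    // the trailing "i" is the int parameter.
    int func(int x) { return x + 1; }

    // Hypothetical overload, mangled as _Z4funcd ("d" for double).
    // It is a separate symbol, so it does not conflict with func(int).
    double func(double x) { return x + 0.5; }

    int main() {
        assert(func(10) == 11);
        assert(func(10.0) == 10.5);
        std::cout << func(10) << " " << func(10.0) << std::endl;  // prints 11 10.5
        return 0;
    }
    ```

    The return type is not part of the mangled name for ordinary functions like these, which is why a declaration and a definition are matched by name and parameter types.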