Search code examples
cfunctionexternfunction-definitionlinkage

C - One-definition rule for functions


I'm new to C and have read that each function may only be defined once, but I can't seem to reconcile this with what I'm seeing in the console. For example, I am able to overwrite the definition of printf without an error or warning:

#include <stdio.h>

extern int printf(const char *__restrict__format, ...) {
    putchar('a');
}

int main() {
    printf("Hello, world!");
    return 0;
}

So, I tried looking up the one-definition rule in the standard and found Section 6.9 (5) on page 155, which says (emphasis added):

An external definition is an external declaration that is also a definition of a function (other than an inline definition) or an object. If an identifier delared with external linkage is used in an expression [...], somewhere in the entire program there shall be exactly one external definition for that identifier; otherwise, there shall be no more than one.

My understanding of linkage is very shaky, so I'm not sure if this is the relevant clause or what exactly is meant by "entire program". But if I take "entire program" to mean all the stuff in <stdio.h> + my source file, then shouldn't I be prohibited from redefining printf in my source file since it has already been defined earlier in the "entire program" (i.e. in the stdio bit of the program)?


My apologies if this question is a dupe, I couldn't find any existing answers.


Solution

  • The C standard does not define what happens if there is more than one definition of a function.

    … shouldn't I be prohibited…

    The C standard has no jurisdiction over what you do. It specifies how C programs are interpreted, not how humans may behave. Although some of its rules are written using “shall,” this is not a command to the programmer about what they may or may not do. It is a rhetorical device for specifying the semantics of C programs. C 2018 4 2 tells us what it actually means:

    If a “shall” or “shall not” requirement that appears outside of a constraint or runtime-constraint is violated, the behavior is undefined…

    So, when you provide a definition of printf and the standard C library provides a definition of printf, the C standard does not specify what happens. In common practice, several things may happen:

    • The linker uses your printf. The printf in the library is not used.
    • The compiler has built-in knowledge of printf and uses that in spite of your definition of printf.
    • If your printf is in a separate source module, and that module is compiled and inserted into a library, then which printf the program uses depends on the order the libraries are specified to the linker.

    While the C standard does not define what happens if there are multiple definitions of a function (or an external symbol in general), linkers commonly do. Ordinarily, when a linker processes a library file, its behavior is:

    • Examine each module in the library. If the module defines a symbol that is referenced by a previously incorporated object module but not yet defined, then include that module in the output the linker is building. If the module does not define any such symbol, do not use it.

    Thus, for ordinary functions, the behavior of multiple definitions that appear in library files is defined by the linker, even though it is not defined by the C standard. (There can be complications, though. Suppose a program uses cos and sin, and the linker has already included a module that defines cos when it finds a library module that defines both sin and cos. Because the linker has an unresolved reference to sin, it includes this library module, which brings in a second definition of cos, causing a multiple-definition error.)

    Although the linker behavior may be well defined, this still leaves the issue that compilers have built-in knowledge about the standard library functions. Consider this example. Here, I added a second printf, so the program has:

    printf("Hello, world!");
    printf("Hello, world!\n");
    

    The program output is “aHello, world.\n”. This shows the program used your definition for the first printf call but used the standard behavior for the second printf call. The program behaves as if there are two different printf definitions in the same program.

    Looking at the assembly language shows what happens. For the second call, the compiler decided that, since printf("Hello, world!\n"); is printing a string with no conversion specifications and ending with a new-line character, it can use the more-efficient puts routine instead. So the assembly language has call puts for the second printf. The compiler cannot do this for the first printf because it does not end with a new-line character, which puts automatically adds.