I read here that the extern keyword can be used in combination with an initialization, which would turn the declaration into an actual definition according to the C standard.
First of all I couldn't really find an actual passage defining this specific condition in the current C11 standard (draft). Page 158ff only gives examples without initialization.
Further on, when I try to compile the following:
testfile.h
extern int var1=10;
void testFcn(void);
testfile.c
#include "testfile.h"
void testFcn(void){
int var3 = var1;
}
main.c
#include "testfile.h"
int main(void){
testFcn();
}
...my compiler (gcc/5.4.1) warns me about the following:
testfile.h:1:12: warning: ‘var1’ initialized and declared ‘extern’
extern int var1=10;
^
In file included from testfile.c:1:0:
testfile.h:1:12: warning: ‘var1’ initialized and declared ‘extern’
extern int var1=10;
^
And the linker throws an error confirming that there's a duplicate definition:
/tmp/ccE8M7S0.o:(.bss+0x0): multiple definition of `var1'
/tmp/cc7OrQEI.o:(.bss+0x0): first defined here
collect2: error: ld returned 1 exit status
I understand the compiler warnings but not the linker error. Shouldn't the linker replace the testfile reference with the object code of the very same file? I know how to implement it in a better way (i.e. defining objects only in source files) but I want to understand why this specific arrangement won't work.
Conclusion from the discussion below:
My main confusion was that I expected the pre-processor and the linker to pass information to each other about where certain object definitions come from. I now realize that is nonsense, but I thought the linker should have learned from the pre-processor that the variable var1 was defined in testfile.h. In other words, the linker was supposed to merge those two definitions. But that's what the static keyword is for.
Thank you to all who were helping to clear that up.
Edit1: Changed the initialization value to 10, since initializing to 0 seemed to distract from the actual problem. Also pointed out that doing it differently would definitely be the way to solve the problem, but I'd like to understand it completely first.
Edit2: Adding conclusion.
The standard specifies:
If the declaration of an identifier for an object has file scope and an initializer, the declaration is an external definition for the identifier.
A declaration of an identifier for an object that has file scope without an initializer, and without a storage-class specifier or with the storage-class specifier static, constitutes a tentative definition. If a translation unit contains one or more tentative definitions for an identifier, and the translation unit contains no external definition for that identifier, then the behavior is exactly as if the translation unit contains a file scope declaration of that identifier, with the composite type as of the end of the translation unit, with an initializer equal to 0.
(C2011, 6.9.2/1-2)
Thus, if your header contains
extern int var1=10;
then every file that includes it contains an (external) definition of var1. Furthermore, if it contains just
int var1;
and there is no other file-scope declaration of var1 that designates it extern in the translation unit, then that translation unit also contains a definition of var1. If there is no declaration designating it static, then that declaration furthermore is an external declaration, because external linkage is the default for file-scope declarations.
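To make the distinction between tentative and external definitions concrete, here is a minimal single-file sketch. The identifiers (tentative_var, defined_var and the two getter functions) are hypothetical names chosen for illustration; they do not come from the question. The form with both extern and an initializer is mentioned only in a comment, because gcc warns about it as shown in the question.

```c
/* Single translation unit; all names here are illustrative assumptions. */

/* File scope, no initializer, no storage-class specifier:
 * each of these is a tentative definition, and repeating it is allowed. */
int tentative_var;
int tentative_var;

/* A tentative definition followed by a declaration with an initializer.
 * The second line is an external definition (C11 6.9.2/1). */
int defined_var;
int defined_var = 10;

/* "extern int x = 10;" would also be an external definition, because it
 * has file scope and an initializer; it is left out to avoid the warning. */

/* No external definition of tentative_var exists in this translation unit,
 * so it behaves as if it had been defined with an initializer equal to 0. */
int get_tentative_var(void) { return tentative_var; }

/* defined_var takes the value from its external definition. */
int get_defined_var(void) { return defined_var; }
```

Calling get_tentative_var() and get_defined_var() from a main function would yield 0 and 10 respectively. If the same file-scope lines were placed in a header and included from two source files, each translation unit would carry its own definitions of both objects.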
But the standard specifies that:
An external definition is an external declaration that is also a definition of a function (other than an inline definition) or an object. If an identifier declared with external linkage is used in an expression (other than as part of the operand of a sizeof or _Alignof operator whose result is an integer constant), [then] somewhere in the entire program there shall be exactly one external definition for the identifier; otherwise, there shall be no more than one.
(C2011, 6.9/5; emphasis added)
Thus, if you put an external definition of a variable in a header file (as in your example), and include that header in more than one source file contributing to the same program or library, then you violate a requirement of the standard, and the program's behavior is undefined.
C does not specify particular behaviors for the innumerable ways in which a program can fail to conform, so what the linker actually does with such code is a question of implementation detail. In many cases, however, if there is an external definition of an object in a given translation unit, then the compiler will allocate storage and associate an externally-visible symbol with it in the corresponding object file.
When a linker is faced with two or more object files containing identical strong symbols, it has a conundrum: which does it use? Some error out. Some, under certain circumstances, merge the symbols and the objects to which they refer.
Shouldn't the linker replace the testfile reference with the object code of the very same file?
There's no "should" or "should not" with respect to non-conforming code. Moreover, the standard specifies:
In the set of translation units and libraries that constitutes an entire program, each declaration of a particular identifier with external linkage denotes the same object or function.
(C2011, 6.2.2/2)
So no, it is not reasonable to suppose that the linker should just choose the object defined in the same translation unit, though it is conceivable that indeed some do so. But if that's what you want then you should declare the object with internal linkage -- that is, declare it with the static storage-class specifier. In that case, the declaration generally should not appear in a header at all, as that would give every translation unit that includes the header its own copy of the variable, which is not usually wanted.
For the record, if you want to provide an external variable then the way to do so is with an external declaration that is not a definition in a header file:
extern int foo;
combined with a definition in exactly one source file, for example:
extern int foo = 0;
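As a rough illustration of that arrangement, the sketch below folds the header part and the source-file part into a single translation unit so that it compiles on its own. The helper functions set_foo and get_foo are hypothetical additions for demonstration. The definition is written without extern here; for a file-scope object with an initializer this is an equivalent definition with external linkage, and it avoids the gcc warning quoted in the question.

```c
/* --- what the header would contain (hypothetical foo.h) --- */

/* Declaration that is not a definition: no storage is allocated. */
extern int foo;

/* Repeating the declaration is harmless; this is effectively what
 * happens when several source files include the same header. */
extern int foo;

/* --- what exactly one source file would contain --- */

/* The single external definition of foo for the whole program. */
int foo = 0;

/* Illustrative helpers that use the shared object. */
void set_foo(int value) { foo = value; }
int get_foo(void) { return foo; }
```

Every other source file that includes the header sees only the extern declaration, so the linker finds exactly one definition of foo and resolves all references to it.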