I would like to understand better how the linker works when building C++ code.
If I define a function or a global variable in multiple cpp files, I get linker errors for multiple definitions. That makes sense, because there are multiple versions and the linker cannot decide on a particular one. To circumvent that, one writes or includes only the declaration (signature only for functions, extern for variables). However, I have noticed that you CAN define methods inside class definitions, and most people here deem that acceptable or even good practice for trivial functions (like trivial getters and setters), because it allows the compiler to inline these functions (and it is also necessary for templates).
In the discussion around "pragma once", I gathered that in some situations the toolchain will not be able to tell whether two files are the same or not, so in principle, it could happen that two cpp files get the same class name declared from different headers, but with different definitions of such header-only methods, couldn't it?
I have tried to set up an example: main.cpp
#include <iostream>
#include "Class1.hpp"
#include "Class2.hpp"
using namespace std;
int main() {
    Class1 c1;
    Class2 c2(c1);
    c1.set(1);
    cout << c1.get() << endl;
    c2.print();
    return 0;
}
Class1.hpp:
#ifndef CLASS1_HPP
#define CLASS1_HPP
#warning Class1
class Class1 {
public:
    void set(int i) { val = i; };
    int get() {return val;};
    int val=0;
};
#endif
Class1a.hpp
#ifndef CLASS1_HPP
#define CLASS1_HPP
#warning Class1a
class Class1 {
public:
    void set(int i) { val = i; };
    int get() {return -1*val;};
    int val=0;
};
#endif
Class2.hpp:
#pragma once
#ifndef CLASS2_HPP
#define CLASS2_HPP
#include <iostream>
#include "Class1a.hpp"
using namespace std;
class Class2 {
public:
    Class2(Class1 &c1) : c1(c1) {};
    void print();
    Class1& c1;
};
#endif
Class2.cpp
#include "Class2.hpp"
void Class2::print() {
    cout << c1.get() << endl;
}
However, I get the following output:
$ g++ *.cpp; ./a.out
In file included from Class2.hpp:6:0,
from Class2.cpp:1:
Class1a.hpp:4:2: warning: #warning Class1a [-Wcpp]
#warning Class1a
^~~~~~~
-1
-1
I don't quite get why Class1 (not-a) is never seen by the preprocessor despite the fact that it is included first in main.cpp, so I guess my question extends to that ... [Edit: I cannot reproduce the preprocessor issue anymore, this now produces the same result as the code below, as I expected originally]
Edit: removed pragma once to avoid further confusion and deviations.
Ok, since people seem to get this mixed up, here is what I would have expected the result of the preprocessor to be:
main.cpp:
#include <iostream>
using namespace std;
class Class1 {
public:
    void set(int i) { val = i; };
    int get() {return val;}; // <-- This line is different!
    int val=0;
};
class Class2 {
public:
    Class2(Class1 &c1) : c1(c1) {};
    void print();
    Class1& c1;
};
int main() {
    Class1 c1;
    Class2 c2(c1);
    c1.set(1);
    cout << c1.get() << endl;
    c2.print();
    return 0;
}
Class2.cpp:
#include <iostream>
using namespace std;
class Class1 {
public:
    void set(int i) { val = i; };
    int get() {return -1*val;};
    int val=0;
};
class Class2 {
public:
    Class2(Class1 &c1) : c1(c1) {};
    void print();
    Class1& c1;
};
void Class2::print() {
    cout << c1.get() << endl;
}
No idea why the preprocessor thing before did not work. Maybe someone cares to explain, despite the fact that it is not my main question. And, yes, of course I know that writing such code is a bad idea, I just want to know how it is dealt with. Completely academic question.
What I find now is that the output of the executable depends on the order in which I state the cpp files for g++:
$ g++ main.cpp Class2.cpp
$ ./a.out
1
1
$ g++ Class2.cpp main.cpp
$ ./a.out
-1
-1
So at some point, the linker seems to grab the next best version of the method. Why does the same not seem to happen with functions and variables, and could it be avoided (because this seems like something that should at least produce a warning)?
Additional example with functions. main.cpp
#include <iostream>
using namespace std;
int get() {return 1;}
void print();
int main() {
    cout << get() << endl;
    print();
}
method2.cpp
#include <iostream>
using namespace std;
int get() { return -1; }
void print() {
    cout << get() << endl;
}
Here, the multiple definition gets caught:
$ g++ main.cpp method2.cpp
/tmp/ccjCKBLm.o: In function `get()':
method2.cpp:(.text+0x0): multiple definition of `get()'
/tmp/ccnvH0iR.o:main.cpp:(.text+0x0): first defined here
/tmp/ccnvH0iR.o: In function `main':
main.cpp:(.text+0x38): undefined reference to `print()'
collect2: error: ld returned 1 exit status
If I add inline to the functions, then it compiles again, but always returns 1 regardless of the order of the arguments of g++, which is in line (no pun intended) with the winning answer below.
Looking at [basic.def.odr]/10, we have:
Every program shall contain exactly one definition of every non-inline function or variable that is odr-used in that program outside of a discarded statement; no diagnostic required. [...]
This makes having multiple definitions of a (non-inline) function or variable an ODR violation. The linker is not required to diagnose this, but since it is usually easy to do so, you will frequently see this diagnosed.
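As a minimal sketch of the usual way to stay within this rule: the header part only declares, and exactly one source file defines. The file names counter.hpp and counter.cpp and the names g_counter, next_id and check_counter are made up for illustration, and both parts are shown in one block only so that it compiles on its own.

```cpp
#include <cassert>

// --- would live in a shared header (hypothetical "counter.hpp") ---
extern int g_counter;  // declaration only: no storage is reserved here
int next_id();         // declaration only: signature without a body

// --- would live in exactly one source file (hypothetical "counter.cpp") ---
int g_counter = 0;                     // the single definition of the variable
int next_id() { return ++g_counter; }  // the single definition of the function

// Any translation unit that sees the declarations above may call this.
void check_counter() {
    int first = next_id();
    int second = next_id();
    assert(second == first + 1);  // both calls resolve to the one definition
}
```

Every other cpp file would include only the declarations, so the linker finds exactly one definition of each name.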
Then we have [basic.def.odr]/12:
There can be more than one definition of a
[...]
- inline function [...]
[...]
in a program provided that each definition appears in a different translation unit, and provided the definitions satisfy the following requirements. [...] Given such an entity named D defined in more than one translation unit, then
- each definition of D shall consist of the same sequence of tokens, [...]
[...] If the definitions of D do not satisfy these requirements, then the program is ill-formed, no diagnostic required.
Your Class1::get method violates this. It is implicitly inline (because it is defined in a class definition, see [dcl.inline]/4; the above rules are also summarized in that section), so having multiple definitions is allowed, but they do not consist of the same token sequence.
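To illustrate what "implicitly inline" means, here is a sketch based on the question's Class1. The const qualifier, the private section and the check_round_trip helper are my additions, not part of the original code. Both member functions are inline functions; only the spelling differs.

```cpp
#include <cassert>

class Class1 {
public:
    // Defined inside the class body: implicitly inline.
    int get() const { return val; }
    // Only declared here; defined below with an explicit `inline`.
    void set(int i);

private:
    int val = 0;
};

// Out-of-class definition that may still live in a header, because `inline`
// permits one (identical) definition per translation unit.
inline void Class1::set(int i) { val = i; }

// Hypothetical helper: stores a value and reads it back.
inline void check_round_trip() {
    Class1 c;
    c.set(1);
    assert(c.get() == 1);
}
```

In both spellings the function may be defined in several translation units, and in both spellings those definitions have to be token-for-token identical.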
Once again, no diagnostic is required. Checking the consistency of multiple definitions of inline functions (and all the other things that I skipped in the above quotes) is not feasible for the linker, so it generally makes no attempt to do so.
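Since the linker will not help here, one possible way to sidestep the problem for a class that is only needed inside a single cpp file is an unnamed namespace, which gives the class internal linkage. This is only a sketch and does not apply to classes that have to be shared between translation units; negated and check_negated are hypothetical helpers.

```cpp
#include <cassert>

namespace {
// Internal linkage: this Class1 is a distinct entity in this translation
// unit, so another cpp file may define a different Class1 inside its own
// unnamed namespace without the two being treated as the same class.
class Class1 {
public:
    void set(int i) { val = i; }
    int get() const { return -1 * val; }

private:
    int val = 0;
};
}  // namespace

// Ordinary function with external linkage that uses the local class.
int negated(int i) {
    Class1 c;
    c.set(i);
    return c.get();
}

void check_negated() {
    assert(negated(1) == -1);
}
```

With this layout, the version of get() used by negated is always the one written in the same file, independent of the order of the files on the g++ command line.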
in principle, it could happen that two cpp files get the same class name declared from different headers, but with different definitions of such header-only methods, couldn't it?
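That could happen, yes, and it would be an ODR violation, making the program ill-formed, no diagnostic required. The use of include guard macros with unique names is a solid countermeasure within a single translation unit, but it cannot reveal definitions that differ between translation units.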
Note that this isn't limited to different headers declaring the same class. It also covers, e.g., the same header being included with different macros defined (#define), so that the preprocessed definitions differ between inclusions of the same header file.
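A sketch of that second situation, with a hypothetical header buffer.hpp and a hypothetical configuration macro BIG_BUFFERS: the header text is the same everywhere, but the tokens that survive preprocessing depend on whether the macro is defined.

```cpp
#include <cassert>

// --- would live in a shared header (hypothetical "buffer.hpp") ---
// BIG_BUFFERS is a made-up configuration macro for this illustration.
inline int buffer_size() {
#ifdef BIG_BUFFERS
    return 4096;
#else
    return 256;
#endif
}

// Hypothetical helper: whichever branch was selected, the result is one of
// the two values above.
inline void check_buffer_size() {
    assert(buffer_size() == 256 || buffer_size() == 4096);
}
```

If one source file were compiled with g++ -DBIG_BUFFERS and another without it, the two translation units would contain different definitions of buffer_size(), which is the same kind of ODR violation as above, again with no diagnostic required.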