On this site I read:
class MyClass;
simply states that "there is such a class" and its full definition will be "coming later" (either in the current file, at compile time, or from some other file at link time)
I'm not sure I understand how this works at link time. I wrote the code below, which should demonstrate it. Please correct me if I'm wrong. I'm not sure how forward declaration works at link time.
//first.h
-----------
class Second;
class First{
public:
Second* ptr;
First();
};
//first.cpp
-----------
#include "first.h"
extern Second second;
First::First(){ptr = &second;}
//second.h
----------
class Second{
public:
Second(){};
};
//main.cpp
----------
#include "second.h"
Second second;
int main(int argc, char *argv[])
{
return 0;
}
This code compiles. If the line Second second; is commented out, the linker reports: undefined reference to 'second'. Some comments tying together 1) forward declaration, 2) compilation unit and 3) linking would be helpful.
I think the documentation you've read has misled you by its laxity:
class MyClass;
doesn't exactly mean there is such a class, because the only way to make a class exist is to define it, and a declaration is not a definition. The declaration would be better read as: Assume there is such a class.
And it doesn't mean that the full definition of the class will, or will not, be coming later. Its full definition might need to come later for successful compilation. Or not. And if the full class definition does need to come later, it will need to come for successful compilation; therefore at compiletime, not linktime.
The undefined reference linkage error that you are able to provoke by commenting out Second second; in main.cpp is simply a plain old undefined reference error such as you'll always get by trying to link a program in which a variable declared extern is referenced somewhere and defined nowhere. It has no essential connection with the extern variable being of class type, rather than, say, int, or with the business of forward class declaration.
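To make the point about extern concrete, here is a minimal single-file sketch using a plain int. The names counter and read_counter are hypothetical and are not part of the question's code; the sketch only illustrates the split between an extern declaration (enough for the compiler) and a definition (needed by the linker).

```cpp
// Declaration only: tells the compiler that an int named `counter`
// exists somewhere. It reserves no storage and defines nothing.
extern int counter;

// This function compiles against the declaration alone.
int read_counter() {
    return counter;
}

// Definition: this is what the linker needs. If this line is commented
// out, the file still compiles, but linking a program that calls
// read_counter() fails with an undefined reference to `counter`.
int counter = 42;
```

Calling read_counter() from main returns 42. No class is involved anywhere, and the compile-versus-link behaviour is the same as with Second second.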
Forward declaration of classes is only ever necessary to preempt a deadlock when the compiler attempts to parse the definitions of two classes that are interdependent and is unable to complete either class definition before it completes the other one.
An elementary example: I naively write two classes, first and second, each of which has a method that uses an object of the other class and calls one of its methods:
first.h
#ifndef FIRST_H
#define FIRST_H
#include <string>
#include <iostream>
#include "second.h"
struct first {
std::string get_type() const {
return "First";
}
void use_a_second(second const & second) const {
std::cout << second.get_type() << std::endl;
}
};
#endif
second.h
#ifndef SECOND_H
#define SECOND_H
#include <string>
#include <iostream>
#include "first.h"
struct second {
std::string get_type() const {
return "Second";
}
void use_a_first(first const & first) const {
std::cout << first.get_type() << std::endl;
}
};
#endif
main.cpp
#include "first.h"
#include "second.h"
int main()
{
first f;
second s;
f.use_a_second(s);
s.use_a_first(f);
return 0;
}
Try to compile main.cpp:
$ g++ -c -o main.o -Wall -Wextra -pedantic main.cpp
In file included from first.h:6:0,
from main.cpp:1:
second.h:13:19: error: ‘first’ has not been declared
void use_a_first(first const & first) const {
^~~~~
second.h: In member function ‘void second::use_a_first(const int&) const’:
second.h:14:22: error: request for member ‘get_type’ in ‘first’, which is of non-class type ‘const int’
std::cout << first.get_type() << std::endl;
^~~~~~~~
main.cpp: In function ‘int main()’:
main.cpp:9:8: error: expected unqualified-id before ‘.’ token
second.use_a_first(first);
The compiler is stymied, because first.h includes second.h, and vice versa, so it can't get the definition of first before it gets the definition of second, which requires the definition of first... and vice versa.
A forward declaration of each class before the definition of the other one, and a corresponding refactoring of each class into a definition and an implementation, gets us out of this deadly embrace:
first.h (fixed)
#ifndef FIRST_H
#define FIRST_H
#include <string>
struct second; // Declaration
struct first{
std::string get_type() const {
return "first";
}
void use_a_second(second const & second) const;
};
#endif
second.h (fixed)
#ifndef SECOND_H
#define SECOND_H
#include <string>
struct first; //Declaration
struct second{
std::string get_type() const {
return "second";
}
void use_a_first(first const & first) const;
};
#endif
first.cpp (new)
#include <iostream>
#include "first.h"
#include "second.h"
void first::use_a_second(second const & second) const {
std::cout << second.get_type() << std::endl;
}
second.cpp (new)
#include <iostream>
#include "first.h"
#include "second.h"
void second::use_a_first(first const & first) const {
std::cout << first.get_type() << std::endl;
}
Compile:
$ g++ -c -o first.o -Wall -Wextra -pedantic first.cpp
$ g++ -c -o second.o -Wall -Wextra -pedantic second.cpp
$ g++ -c -o main.o -Wall -Wextra -pedantic main.cpp
Link:
$ g++ -o prog main.o first.o second.o
Run:
$ ./prog
second
first
This is the only scenario for which forward class declaration is needed. It can be used in wider circumstances: see When can I use a forward declaration?. The need is only ever a need for successful compilation, not linkage. Linkage can't be attempted till compilation succeeds.
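The same fix can be sketched in a single translation unit, which makes it plain that nothing here depends on the linker. This is only an illustration: describe_second and describe_first are hypothetical stand-ins for use_a_second and use_a_first that return the string instead of printing it, so the result is easy to check.

```cpp
#include <string>

struct second; // forward declaration: enough to name `second` in a parameter

struct first {
    std::string get_type() const { return "first"; }
    // Only declared here: the body needs the complete definition of `second`.
    std::string describe_second(second const & s) const;
};

struct second {
    std::string get_type() const { return "second"; }
    // `first` is already complete at this point, so the body can go in-line.
    std::string describe_first(first const & f) const { return f.get_type(); }
};

// `second` is complete by now, so its members can be used.
std::string first::describe_second(second const & s) const {
    return s.get_type();
}
```

Given first f; and second s;, f.describe_second(s) yields "second" and s.describe_first(f) yields "first". Everything is resolved during compilation of this one file.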
The documentation snippet is also misleadingly imprecise in the use of the word definition. The definition of a class means one thing in the context of compilation and that's what it should mean in the interest of clarity. It means something else, loosely, in the context of linkage and it shouldn't mean that in the interest of clarity. In the context of linkage, we'd better only talk about the implementation of a class - and even that is a notion that begs for qualification.
As far as the compiler is concerned a class is defined if it gets from the start to the end of:
class foo ... {
...
};
without error, and then the class definition is the contents of that span. A complete definition does not mean, of course, that a class has a complete implementation. It only has that if, in addition to a complete definition, all the methods and static members that are declared in its definition are also themselves defined somewhere: in-line within the class definition, out-of-line in the containing translation unit, or in other translation units (possibly compiled in external libraries) with which the compiled containing translation unit gets linked. If any of those member definitions is needed and not provided in one of those ways come link-time, an unresolved reference linkage error will result. That is a deficit of the class implementation.
The linker's idea of definition is different from the C++ compiler's and more elementary. From the linker's point of view, a C++ class doesn't actually exist. For the linker, the class implementation is boiled down, by the compiler, to a bunch of symbols and symbol definitions not essentially different from what it gets from any language compiler, whether or not the language deals in classes at all. What matters to the linker, for success, is that all the symbols that are referenced in the output binary have definitions either in the same binary or in dynamic libraries requested in the linkage. A symbol (broadly) can identify some executable code or some data. For a code symbol, definition means implementation to the linker: the definition is the represented code, if any. For a data symbol, definition means value to the linker: it means the represented data, if any.
So when the snippet says:
.. and its full definition will be "coming later" (either in the current file, at compile time, or from some other file at link time)
this needs to be picked apart.
The full definition of class foo must come later in the compilation of a translation unit, before type foo is required as the type of anything else, specifically the type of a base class, a function/method argument, or an object [1]. If this requirement is not satisfied, a compile error will result.
If foo is never required later to be the type of a base class, argument or object, then the definition of class foo need never follow the declaration.
The full implementation of class foo may or may not be required, or provided, by the linkage. Since the linker doesn't know about classes, it cannot distinguish a full implementation of a class from an incomplete one.
You can change class first, above, by adding a method that has no implementation:
struct first{
std::string get_type() const {
return "first";
}
void use_a_second(second const & second) const;
void unused();
};
and the program will compile, link and run just the same. Since the compiler emits no definition of void first::unused(), and since the program does not attempt to invoke void first::unused() on any object of type first, or to use its address, no mention of void first::unused() appears in the linkage at all. If we change main.cpp to:
#include "first.h"
#include "second.h"
int main()
{
first f;
second s;
f.use_a_second(s);
s.use_a_first(f);
f.unused();
return 0;
}
Then the linker will find a call to void first::unused() in main.o and of course give an unresolved reference error. But this just means that the linkage fails to provide an implementation that the program needs. It doesn't mean that the class definition of first is incomplete. If it were, compilation of main.cpp would have failed, and no linkage would have been attempted.
Takeaway:
Forward class declaration can avert compiletime deadlock of mutually dependent class definitions, with consequential refactoring.
A forward class declaration can't avert an unresolved reference linkage error. Such an error always means that the implementation of a code symbol, or the value of a data symbol, is needed by the program and not provided by the linkage. A class declaration cannot add either one of those things to the linkage. It adds nothing to the linkage. It just directs the compiler to tolerate foo in contexts where it is necessary and sufficient for foo to be a class-name.
Linkage cannot provide any part of a class definition at linktime if, after a forward class declaration, the class definition becomes required, because a complete class definition will be required at compiletime or not at all. Linkage cannot provide parts of a class definition at all; only elements of the class implementation.
[1]
class foo;
foo & bar();
...
foo * pfoo;
...
foo & rfoo = bar();
can compile, with merely the declaration of class foo, because neither foo * pfoo nor foo & rfoo requires an object of type foo to exist: a pointer-to-foo, or reference-to-foo, is not a foo.
But:
class foo;
...
foo f; // Error
...
foo * pfoo;
...
pfoo->method(); // Error
can't compile, because f must be a foo, and the object addressed by pfoo must exist, and therefore be a foo, if any method is invoked through that object.
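The two fragments above can be combined into one translation unit to show the definition "coming later" at compile time. This is a sketch under assumptions: the body of foo, the value returned by method() and the helper call_through_pointer are all hypothetical, chosen only so that the result can be checked.

```cpp
class foo;               // declaration only: foo is an incomplete type here

foo & bar();             // OK: a reference return type does not need the definition
foo * pfoo = nullptr;    // OK: a pointer-to-foo is not a foo

// foo f;                // would not compile here: an object needs the complete type
// pfoo->method();       // would not compile here either, for the same reason

class foo {              // the definition arrives later in the same translation unit
public:
    int method() const { return 7; }
};

foo & bar() {
    static foo instance; // OK now: foo is complete
    return instance;
}

// Hypothetical helper: uses foo as a complete type through the pointer.
int call_through_pointer() {
    pfoo = &bar();
    return pfoo->method(); // OK now: the compiler has seen foo's members
}
```

Here call_through_pointer() returns 7. Uncommenting either of the two commented lines produces a compile error, not a link error, which is the distinction the answer draws.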