c++ linker header-files encapsulation

Why doesn't the linker preserve encapsulation?


Let's assume the following header, foo.h:

class Foo {
  private:
    void print() const;
};

and the following foo.cpp:

#include <iostream>

#include "foo.h"

void Foo::print() const {
    std::cout << "Secret" << std::endl;
}

Another header, foo1.h, is the same as foo.h except that the method print is declared public:

class Foo {
  public:
    void print() const;
};

And this is main.cpp, which just calls print using the declaration from foo1.h:

#include "foo1.h"

int main() {
    Foo f;
    f.print();
    return 0;
}

What seems strange to me is that the following compiles and links without complaint:

g++ foo.cpp -c -o foo.o
g++ main.cpp -c -o main.o
g++ main.o foo.o -o exec
./exec

The last command will output:

Secret

So without knowing the concrete implementation of class Foo, but knowing its declaration and having its object file, we can create a situation in which its methods can be called even though they are declared private.

My questions are:

  1. Why does it work? Doesn't the linker consider private and public declarations?

  2. Is this behavior useful in practice? If so, how is it used? My guess is that it could be useful for testing.


Solution

  • First off, since you're violating the One Definition Rule (C++11 3.2/5 "One definition rule" says that separate definitions of a class in different translation units must "consist of the same sequence of tokens"), anything goes as far as the toolchain is concerned. It could diagnose an error, or it could produce a program that appears to work (as in your test).

    A simple reason your experiment produces the results you see is that access to a class member is enforced by the compiler, and in main.cpp you have told the compiler that access to the member Foo::print() is public.
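To make the "enforced by the compiler" point concrete, here is a minimal sketch that asks the compiler whether obj.print() is well-formed from outside the class. The names FooPrivate, FooPublic and can_call_print are made up for illustration; the two classes have different names so that both definitions can live in one translation unit without violating the ODR. It assumes a C++11 or later compiler, where access checking is part of template argument substitution.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical stand-ins for the two conflicting definitions of Foo.
class FooPrivate {
  private:
    void print() const {}
};

class FooPublic {
  public:
    void print() const {}
};

// Primary template: by default, assume obj.print() cannot be called
// from code outside the class.
template <typename T, typename = void>
struct can_call_print : std::false_type {};

// This specialization is selected only when the call expression is
// well-formed, and being well-formed includes passing the access check.
template <typename T>
struct can_call_print<T, decltype(void(std::declval<const T&>().print()))>
    : std::true_type {};

// Both answers are produced during compilation, long before any linker runs.
static_assert(!can_call_print<FooPrivate>::value,
              "private print() is rejected by the compiler");
static_assert(can_call_print<FooPublic>::value,
              "public print() is accepted by the compiler");
```

Nothing about the access level needs to survive into the object file for these checks to work, which is consistent with what the experiment in the question shows.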

    It would be conforming for a toolchain to encode a member's access in the mangled name, which is generated anyway for other reasons (such as overloading). However, since the standard doesn't require the toolchain to enforce access at link time, it seems that implementers decided they didn't need to account for access control there. In other words, I think it would be feasible to encode access control into the external symbol that the linker uses, but that work wasn't done, probably because it isn't strictly necessary.

    Note that Microsoft C++ does incorporate a member's access in the external name, so with that toolchain you do get a link-time error:

    testmain.obj : error LNK2019: unresolved external symbol "public: void __thiscall Foo::print(void)const " (?print@Foo@@QBEXXZ) referenced in function _main
    testmain.exe : fatal error LNK1120: 1 unresolved externals
    

    Here are the symbols g++ produces (along with a c++filt decode):

    D:\so-test>nm test.o | grep Foo
    000000000000008c t _GLOBAL__sub_I__ZNK3Foo5printEv
    0000000000000000 T _ZNK3Foo5printEv
    
    D:\so-test>nm testmain.o | grep Foo
                     U _ZNK3Foo5printEv
    
    D:\so-test>c++filt _ZNK3Foo5printEv
    Foo::print() const
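The c++filt decode above can also be reproduced from code. This is a small sketch using abi::__cxa_demangle from <cxxabi.h>, which GCC and Clang provide (it is not available with MSVC). The helper name demangle is my own, not part of any library.

```cpp
#include <cxxabi.h>  // abi::__cxa_demangle (GCC and Clang only)
#include <cstdlib>   // std::free
#include <string>

// Decode an Itanium-ABI mangled name.
// Returns the input unchanged if it cannot be decoded.
std::string demangle(const char* mangled) {
    int status = 0;
    // With null buffer arguments the result is allocated with malloc.
    char* decoded = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || decoded == nullptr) {
        return mangled;
    }
    std::string result(decoded);
    std::free(decoded);  // the caller owns the malloc'd buffer
    return result;
}
```

Calling demangle("_ZNK3Foo5printEv") gives "Foo::print() const", the same text c++filt printed. The decoded name carries the class, the member name, the parameter list and the const qualifier, but no access specifier, so the definition in one object file and the reference in the other resolve to the same symbol.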
    

    And here are the symbols MS C++ produces (along with a decode):

    D:\so-test>dumpbin /symbols test.obj | grep Foo
    22D 00000000 SECTBA notype ()    External     | ?print@Foo@@ABEXXZ (private: void __thiscall Foo::print(void)const )
    
    D:\so-test>dumpbin /symbols testmain.obj | grep Foo
    009 00000000 UNDEF  notype ()    External     | ?print@Foo@@QBEXXZ (public: void __thiscall Foo::print(void)const )
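To see exactly where the two MS C++ decorated names disagree, a quick sketch like the following compares them character by character. The function first_difference and the two constant names are made up for this illustration; the strings themselves are copied from the dumpbin listings.

```cpp
#include <cstddef>
#include <string>

// Index of the first character at which a and b differ,
// or std::string::npos if the two strings are identical.
std::size_t first_difference(const std::string& a, const std::string& b) {
    const std::size_t n = a.size() < b.size() ? a.size() : b.size();
    for (std::size_t i = 0; i < n; ++i) {
        if (a[i] != b[i]) {
            return i;
        }
    }
    // Equal up to the shorter length: identical only if lengths match.
    return a.size() == b.size() ? std::string::npos : n;
}

// Decorated names copied from the dumpbin output.
const std::string defined_symbol    = "?print@Foo@@ABEXXZ";  // decoded as private
const std::string referenced_symbol = "?print@Foo@@QBEXXZ";  // decoded as public
```

first_difference(defined_symbol, referenced_symbol) returns 12, where one name has 'A' and the other has 'Q'. That single character is all that separates the symbol test.obj defines from the one testmain.obj asks for, and it is enough for the linker to treat them as unrelated names.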