c++design-patterns multiple-inheritance virtual-inheritance pimpl-idiom

reduce size of object (wasted) in Multi virtual inheritance

After profiling, I found that a large portion of memory of my program are wasted by multi-virtual-inheritance.

This is MCVE to demostrate the problem ( http://coliru.stacked-crooked.com/a/0509965bea19f8d9 )

#include<iostream>
class Base{
    public: int id=0;  
};
class B : public virtual Base{
    public: int fieldB=0;
    public: void bFunction(){
        //do something about "fieldB"     
    }
};
class C : public virtual B{
    public: int fieldC=0;
    public: void cFunction(){
        //do something about "fieldC"     
    }
};
class D : public virtual B{
    public: int fieldD=0;
};
class E : public virtual C, public virtual D{};
int main (){
    std::cout<<"Base="<<sizeof(Base)<<std::endl; //4
    std::cout<<"B="<<sizeof(B)<<std::endl;       //16
    std::cout<<"C="<<sizeof(C)<<std::endl;       //32
    std::cout<<"D="<<sizeof(D)<<std::endl;       //32
    std::cout<<"E="<<sizeof(E)<<std::endl;       //56
}

I hope sizeof(E) to be not much more than 16 bytes (id+fieldB+fieldC+fieldD).
From experiment, if it is non virtual inheritance, E's size will be 24 (MCVE).

How to reduce size of E (by C++ magic, change program architecture, or design pattern)?

Requirement:-

Base,B,C,D,E can't be template class. It will cause circular dependency for me.
I must be able to call base class's function from derived class (if any) e.g. e->bFunction() and e->cFunction(), as usual.
However, it is OK if I can't call e->bField anymore.
I still want the ease of declaration.
Currently, I can declare "E inherit from C and D" as class E : public virtual C, public virtual D easily.

I am thinking about CRTP e.g. class E: public SomeTool<E,C,D>{}, but not sure how to make it works.

To make things easier :

In my case, every class is used like it is monolith, i.e. I will never cast object between types like static_cast<C*>(E*) or vise versa.
Macro is allowed, but discouraged.
Pimpl idiom is allowed. Actually, below is what I daydream.
Perhaps, I may be able to remove all virtual-inheritance.
However, with all the requirements, I can't find a way to code it.
In pimpl, if I make E virtual inherit from C & D, etc, all above requirement will be met, but I will still waste a lot of memory. :-

I am using C++17.

Edit

Here is a more correct description of my real-life problem.
I create a game that has many components e.g. B C D E.
All of them are created via pool. Thus, it enables fast iterating.
Currently, if I query every E from a game engine, I will be able to call e->bFunction().
In my most severe case, I waste 104 bytes per object in E-like class. (real hierarchy is more complex)

Edit 3

Let me try again. Here is a more meaningful class diagram.
I have a central system to assign hpPtr,flyPtr,entityId,componentId,typeId automatically already.
i.e. Don't worry how they are initialized.

In real case, dread diamond happen in many classes, this is the most simple case.

Currently, I call like :-

 auto hps = getAllComponent<HpOO>();
 for(auto ele: hps){ ele->damage(); }
 auto birds = getAllComponent<BirdOO>();
 for(auto ele: birds ){ 
     if(ele->someFunction()){
          ele->suicidalFly();
          //.... some heavy AI algorithm, etc
     }
 }

With this approach, I can enjoy cache coherence as in Entity Component System, and the cool ctrl+space intellisense of HpOO,FlyableOO and BirdOO like in Object-Oriented style.

Everything works fine - it just uses too much memory.

Solution

EDIT: based on latest update to the question and some chatting

Here's the most compact maintaining the virtual in all your classes.

#include <iostream>
#include <vector>

using namespace std;

struct BaseFields {
    int entityId{};
    int16_t componentId{};
    int8_t typeId{};
    int16_t hpIdx;
    int16_t flyPowerIdx;
};

vector<int> hp; // this will contain all the hit points, dynamically resizable, logic up to you
vector<float> flyPower; // this will contain all the fly powers, dynamically resizable, logic up to you

class BaseComponent {
public: // or protected
    BaseFields data;
};
class HpOO : public virtual BaseComponent {
public:
    void damage() {
        hp[data.hpIdx] -= 1;
    }
};
class FlyableOO : public virtual BaseComponent {
public:
    void addFlyPower(float power) {
        flyPower[data.hpIdx] += power;
    }
};
class BirdOO : public virtual HpOO, public virtual FlyableOO {
public:
    void suicidalFly() {
        damage();
        addFlyPower(5);
    }
};

int main (){
    std::cout<<"Base="<<sizeof(BaseComponent)<<std::endl; // 12
    std::cout<<"C="<<sizeof(HpOO)<<std::endl; // 24
    std::cout<<"D="<<sizeof(FlyableOO)<<std::endl; // 24
    std::cout<<"E="<<sizeof(BirdOO)<<std::endl; // 32
}

much smaller class size version dropping all the virtual class stuff:

#include <iostream>
#include <vector>

using namespace std;

struct BaseFields {
};

vector<int> hp; // this will contain all the hit points, dynamically resizable, logic up to you
vector<float> flyPower; // this will contain all the fly powers, dynamically resizable, logic up to you

class BaseComponent {
public: // or protected
    int entityId{};
    int16_t componentId{};
    int8_t typeId{};
    int16_t hpIdx;
    int16_t flyPowerIdx;
protected:
    void damage() {
        hp[hpIdx] -= 1;
    };
    void addFlyPower(float power) {
        flyPower[hpIdx] += power;
    }
    void suicidalFly() {
        damage();
        addFlyPower(5);
    };
};
class HpOO : public BaseComponent {
public:
    using BaseComponent::damage;
};
class FlyableOO : public BaseComponent {
public:
    using BaseComponent::addFlyPower;
};
class BirdOO : public BaseComponent {
public:
    using BaseComponent::damage;
    using BaseComponent::addFlyPower;
    using BaseComponent::suicidalFly;
};

int main (){
    std::cout<<"Base="<<sizeof(BaseComponent)<<std::endl; // 12
    std::cout<<"C="<<sizeof(HpOO)<<std::endl; // 12
    std::cout<<"D="<<sizeof(FlyableOO)<<std::endl; // 12
    std::cout<<"E="<<sizeof(BirdOO)<<std::endl; // 12
    // accessing example
    constexpr int8_t BirdTypeId = 5;
    BaseComponent x;
    if( x.typeId == BirdTypeId ) {
        auto y = reinterpret_cast<BirdOO *>(&x);
        y->suicidalFly();
    }
}

this example assumes your derived classes do not have overlapping functionalities with diverging effects, if you have those you have to add virtual functions to your base class for an extra overhead of 12 bytes (or 8 if you pack the class).

and quite possibly the smallest version still maintaining the virtuals

#include <iostream>
#include <vector>

using namespace std;

struct BaseFields {
    int entityId{};
    int16_t componentId{};
    int8_t typeId{};
    int16_t hpIdx;
    int16_t flyPowerIdx;
};

#define PACKED [[gnu::packed]]

vector<int> hp; // this will contain all the hit points, dynamically resizable, logic up to you
vector<float> flyPower; // this will contain all the fly powers, dynamically resizable, logic up to you

vector<BaseFields> baseFields;

class PACKED BaseComponent {
public: // or protected
    int16_t baseFieldIdx{};
};
class PACKED HpOO : public virtual BaseComponent {
public:
    void damage() {
        hp[baseFields[baseFieldIdx].hpIdx] -= 1;
    }
};
class PACKED FlyableOO : public virtual BaseComponent {
public:
    void addFlyPower(float power) {
        flyPower[baseFields[baseFieldIdx].hpIdx] += power;
    }
};
class PACKED BirdOO : public virtual HpOO, public virtual FlyableOO {
public:
    void suicidalFly() {
        damage();
        addFlyPower(5);
    }
};

int main (){
    std::cout<<"Base="<<sizeof(BaseComponent)<<std::endl; // 2
    std::cout<<"C="<<sizeof(HpOO)<<std::endl; // 16 or 10
    std::cout<<"D="<<sizeof(FlyableOO)<<std::endl; // 16 or 10
    std::cout<<"E="<<sizeof(BirdOO)<<std::endl; // 24 or 18
}

the first number is for unpacked structure, second packed

You can also pack the hpIdx and flyPowerIdx into the entityId using the union trick:

union {
    int32_t entityId{};
    struct {
    int16_t hpIdx;
    int16_t flyPowerIdx;
    };
};

in the above example if not using packing and moving the whole BaseFields structure into the BaseComponent class the sizes remain the same.

END EDIT

Virtual inheritance just adds one pointer size to the class, plus alignment of the pointer (if needed). You can't get around that if you actually need a virtual class.

The question you should be asking yourself is whether you actually need it. Depending on your access methods to this data that might not be the case.

Considering you need virtual inheritance but all common methods that need to be callable from all your classes you can have a virtual base class and use a bit less space than your original design in the following way:

class Base{
    public: int id=0;
    virtual ~Base();
    // virtual void Function();

};
class B : public  Base{
    public: int fieldB=0;
    // void Function() override;
};
class C : public  B{
    public: int fieldC=0;
};
class D : public  B{
    public: int fieldD=0;
};
class E : public  C, public  D{

};

int main (){
    std::cout<<"Base="<<sizeof(Base)<<std::endl; //16
    std::cout<<"B="<<sizeof(B)<<std::endl; // 16
    std::cout<<"C="<<sizeof(C)<<std::endl; // 24
    std::cout<<"D="<<sizeof(D)<<std::endl; // 24
    std::cout<<"E="<<sizeof(E)<<std::endl; // 48
}

In the case that there are cache misses but the CPU still has power to process the results you can furter decrease the size by using compiler-specific instructions to make the data structure as small as possible (next example works in gcc):

#include<iostream>

class [[gnu::packed]] Base {
    public:
    int id=0;
    virtual ~Base();
    virtual void bFunction() { /* do nothing */ };
    virtual void cFunction() { /* do nothing */ }
};
class [[gnu::packed]] B : public Base{
    public: int fieldB=0;
    void bFunction() override { /* implementation */ }
};
class [[gnu::packed]] C : public B{
    public: int fieldC=0;
    void cFunction() override { /* implementation */ }
};
class [[gnu::packed]] D : public B{
    public: int fieldD=0;
};
class [[gnu::packed]] E : public C, public D{

};


int main (){
    std::cout<<"Base="<<sizeof(Base)<<std::endl; // 12
    std::cout<<"B="<<sizeof(B)<<std::endl; // 16
    std::cout<<"C="<<sizeof(C)<<std::endl; // 20
    std::cout<<"D="<<sizeof(D)<<std::endl; // 20
    std::cout<<"E="<<sizeof(E)<<std::endl; //40
}

saving an additional 8 bytes at the price of possibly some CPU overhead (but if memory is the issue might help).

Additionally if there is really a single function you are calling for each of your classes you should only have that as a single function which you override whenever necessary.

#include<iostream>

class [[gnu::packed]] Base {
public:
    virtual ~Base();
    virtual void specificFunction() { /* implementation for Base class */ };
    int id=0;
};

class [[gnu::packed]] B : public Base{
public:
    void specificFunction() override { /* implementation for B class */ }
    int fieldB=0;
};

class [[gnu::packed]] C : public B{
public:
    void specificFunction() override { /* implementation for C class */ }
    int fieldC=0;
};

class [[gnu::packed]] D : public B{
public:
    void specificFunction() override { /* implementation for D class */ }
    int fieldD=0;
};

class [[gnu::packed]] E : public C, public D{
    void specificFunction() override {
        // implementation for E class, example:
        C::specificFunction();
        D::specificFunction();
    }
};

This would also allow you to avoid having to figure out what class which object is before calling the appropriate function.

Furthermore, assuming your original virtual class inheritance idea is what works best for your application you could restructure your data so that it's more easily accessible for caching purposes while also decreasing the size of your classes and having your functions accessible at the same time:

#include <iostream>
#include <array>

using namespace std;

struct BaseFields {
    int id{0};
};

struct BFields {
    int fieldB;
};

struct CFields {
    int fieldB;
};

struct DFields {
    int fieldB;
};

array<BaseFields, 1024> baseData;
array<BaseFields, 1024> bData;
array<BaseFields, 1024> cData;
array<BaseFields, 1024> dData;

struct indexes {
    uint16_t baseIndex; // index where data for Base class is stored in baseData array
    uint16_t bIndex; // index where data for B class is stored in bData array
    uint16_t cIndex;
    uint16_t dIndex;
};

class Base{
    indexes data;
};
class B : public virtual Base{
    public: void bFunction(){
        //do something about "fieldB"
    }
};
class C : public virtual B{
    public: void cFunction(){
        //do something about "fieldC"
    }
};
class D : public virtual B{
};
class E : public virtual C, public virtual D{};

int main (){
    std::cout<<"Base="<<sizeof(Base)<<std::endl; // 8
    std::cout<<"B="<<sizeof(B)<<std::endl; // 16
    std::cout<<"C="<<sizeof(C)<<std::endl; // 16
    std::cout<<"D="<<sizeof(D)<<std::endl; // 16
    std::cout<<"E="<<sizeof(E)<<std::endl; // 24
}

Obviously this is just an example and it assumes you don't have more than 1024 objects at a point, you can increase that number but above 65536 you'd have to use a bigger int to store them, also below 256 you can use uint8_t to store the indexes.

Furthermore if one of the structures above adds very little overhead to it's parent you could reduce the number of arrays you use to store the data, if there's very little difference in the size of objects you can just store all the data in a single structure and have more localized memory accesses. That all depends on your application so I can't give more advice here other than to benchmark what works best for your case.

Have fun and enjoy C++.