What's the "price" of using inheritance in an embedded environment with C++?

I'm starting a new embedded project with C++ and I was wondering if it is too much expensive to use a interface oriented design. Something like this:

typedef int data;

class data_provider {

public:
    virtual data get_data() = 0;
};

class specific_data_provider : public data_provider {
public:
    data get_data() {
    return 7;
    }
};

class my_device {
public:
    data_provider * dp;
    data d;

    my_device (data_provider * adp) {
    dp = adp;
    d = 0;
    }

    void update() {
    d = dp->get_data();
    }
};

int
main() {
    specific_data_provider sdp;
    my_device dev(&sdp);

    dev.update();

    printf("d = %d\n", dev.d);

    return 0;
}

Solution

Inheritance, on its own, is free. For example, below, B and C are the same from a performance/memory point of view:

struct A { int x; };
struct B : A { int y; };
struct C { int x, y; };

Inheritance only incurs a cost when you have virtual functions.

struct A { virtual ~A(); };
struct B : A { ... };

Here, on virtually all implementations, both A and B will be one pointer size larger due to the virtual function.

Virtual functions also have other drawbacks (when compared with non-virtual functions)

Virtual functions require that you look up the vtable when called. If that vtable is not in the cache then you will get an L2 miss, which can be incredibly expensive on embedded platforms (over 600 cycles on current gen game consoles for example).
Even if you hit the L2 cache, if you branch to many different implementations then you will likely get a branch misprediction on most calls, causing a pipeline flush, which again costs many cycles.
You also miss out on many optimisation opportunities due to virtual functions being essentially impossible to inline (except in rare cases). If the function you call is small then this could add a serious performance penalty compared to a inlined non-virtual function.
Virtual calls can contribute to code bloat. Every virtual function call adds several bytes worth of instructions to lookup the vtable, and many bytes for the vtable itself.

If you use multiple inheritance then things get worse.

Often people will tell you "don't worry about performance until your profiler tells you to", but this is terrible advice if performance is at all important to you. If you don't worry about performance then what happens is that you end up with virtual functions everywhere, and when you run the profiler, there is no one hotspot that needs optimising -- the whole code base needs optimising.

My advice would be to design for performance if it is important to you. Design to avoid the need for virtual functions if at all possible. Design your data around the cache: prefer arrays to node-based data structures like std::list and std::map. Even if you have a container of a few thousand elements with frequent insertions into the middle, I would still go for an array on certain architectures. The several thousand cycles you lose copying data for the insertions may well be offset by the cache locality you will achieve on each traversal (Remember the cost of a single L2 cache miss? You can expect a lot of those when traversing a linked list)