Search code examples
c++paradigms

C++ Techniques: Type-Erasure vs. Pure Polymorphism


What are the advantages/disadvantages of the two techniques in comparison ? And more importantly: Why and when should one be used over the other ? Is it just a matter of personal taste/preference ?

To the best of my abilities, I haven't found another post that explicitly addresses my question. Among many questions regarding the actual use of polymorphism and/or type-erasure, the following seems to be closest, or so it seemed, but it doesn't really address my question either:

C++ -& CRTP . Type erasure vs polymorphism

Please, note that I very well understand both techniques. To this end, I provide a simple, self-contained, working example below, which I'm happy to remove, if it is felt unnecessary. However, the example should clarify what the two techniques mean with respect to my question. I'm not interested in discussing nomenclatures. Also, I know the difference between compile- and run-time polymorphism, though I wouldn't consider this to be relevant to the question. Note that my interest is less in performance-differences, if there are any. However, if there was a striking argument for one or the other based on performance, I'd be curious to read it. In particular, I would like to hear about concrete examples (no code) that would really only work with one of the two approaches.

Looking at the example below, one primary difference is the memory-management, which for polymorphism remains on the user-side, and for type-erasure is neatly tucked away requiring some reference-counting (or boost). Having said that, depending on the usage scenarios, the situation might be improved for the polymorphism-example by using smart-pointers with the vector (?), though for arbitrary cases this may very well turn out to be impractical (?). Another aspect, potentially in favor of type-erasure, may be the independence of a common interface, but why exactly would that be an advantage (?).

The code as given below was tested (compiled & run) with MS VisualStudio 2008 by simply putting all of the following code-blocks into a single source-file. It should also compile with gcc on Linux, or so I hope/assume, because I see no reason why not (?) :-) I have split/divided the code here for clarity.

These header-files should be sufficient, right (?).

#include <iostream>
#include <vector>
#include <string>

Simple reference-counting to avoid boost (or other) dependencies. This class is only used in the type-erasure-example below.

class RefCount
{
  RefCount( const RefCount& );
  RefCount& operator= ( const RefCount& );
  int m_refCount;

  public:
    RefCount() : m_refCount(1) {}
    void Increment() { ++m_refCount; }
    int Decrement() { return --m_refCount; }
};

This is the simple type-erasure example/illustration. It was copied and modified in part from the following article. Mainly I have tried to make it as clear and straightforward as possible. http://www.cplusplus.com/articles/oz18T05o/

class Object {
  struct ObjectInterface {
    virtual ~ObjectInterface() {}
    virtual std::string GetSomeText() const = 0;
  };

  template< typename T > struct ObjectModel : ObjectInterface {
    ObjectModel( const T& t ) : m_object( t ) {}
    virtual ~ObjectModel() {}
    virtual std::string GetSomeText() const { return m_object.GetSomeText(); }
    T m_object;
 };

  void DecrementRefCount() {
    if( mp_refCount->Decrement()==0 ) {
      delete mp_refCount; delete mp_objectInterface;
      mp_refCount = NULL; mp_objectInterface = NULL;
    }
  }

  Object& operator= ( const Object& );
  ObjectInterface *mp_objectInterface;
  RefCount *mp_refCount;

  public:
    template< typename T > Object( const T& obj )
      : mp_objectInterface( new ObjectModel<T>( obj ) ), mp_refCount( new RefCount ) {}
    ~Object() { DecrementRefCount(); }

    std::string GetSomeText() const { return mp_objectInterface->GetSomeText(); }

    Object( const Object &obj ) {
      obj.mp_refCount->Increment(); mp_refCount = obj.mp_refCount;
      mp_objectInterface = obj.mp_objectInterface;
    }
};

struct MyObject1 { std::string GetSomeText() const { return "MyObject1"; } };
struct MyObject2 { std::string GetSomeText() const { return "MyObject2"; } };

void UseTypeErasure() {
  typedef std::vector<Object> ObjVect;
  typedef ObjVect::const_iterator ObjVectIter;

  ObjVect objVect;
  objVect.push_back( Object( MyObject1() ) );
  objVect.push_back( Object( MyObject2() ) );

  for( ObjVectIter iter = objVect.begin(); iter != objVect.end(); ++iter )
    std::cout << iter->GetSomeText();
}

As far as I'm concerned, this seems to achieve pretty much the same using polymorphism, or maybe not (?).

struct ObjectInterface {
  virtual ~ObjectInterface() {}
  virtual std::string GetSomeText() const = 0;
};

struct MyObject3 : public ObjectInterface {
  std::string GetSomeText() const { return "MyObject3"; } };

struct MyObject4 : public ObjectInterface {
  std::string GetSomeText() const { return "MyObject4"; } };

void UsePolymorphism() {
  typedef std::vector<ObjectInterface*> ObjVect;
  typedef ObjVect::const_iterator ObjVectIter;

  ObjVect objVect;
  objVect.push_back( new MyObject3 );
  objVect.push_back( new MyObject4 );

  for( ObjVectIter iter = objVect.begin(); iter != objVect.end(); ++iter )
    std::cout << (*iter)->GetSomeText();

  for( ObjVectIter iter = objVect.begin(); iter != objVect.end(); ++iter )
    delete *iter;
}

And finally for testing all of the above together.

int main() {
  UseTypeErasure();
  UsePolymorphism();
  return(0);
}

Solution

  • C++ style virtual method based polymorphism:

    1. You have to use classes to hold your data.
    2. Every class has to be built with your particular kind of polymorphism in mind.
    3. Every class has a common binary-level dependency, which restricts how the compiler creates the instance of each class.
    4. The data you are abstracting must explicitly describe an interface that describes your needs.

    C++ style template based type erasure (with virtual method based polymorphism doing the erasure):

    1. You have to use template to talk about your data.
    2. Each chunk of data you are working on may be completely unrelated to other options.
    3. The type erasure work is done within public header files, which bloats compile time.
    4. Each type erased has its own template instantiated, which can bloat binary size.
    5. The data you are abstracting need not be written as being directly dependent on your needs.

    Now, which is better? Well, that depends if the above things are good or bad in your particular situation.

    As an explicit example, std::function<...> uses type erasure which allows it to take function pointers, function references, output of a whole pile of template-based functions that generate types at compile time, myraids of functors which have an operator(), and lambdas. All of these types are unrelated to one another. And because they aren't tied to having a virtual operator(), when they are used outside of the std::function context the abstraction they represent can be compiled away. You couldn't do this without type erasure, and you probably wouldn't want to.

    On the other hand, just because a class has a method called DoFoo, doesn't mean that they all do the same thing. With polymorphism, it isn't just any DoFoo you are calling, but the DoFoo from a particular interface.

    As for your sample code... your GetSomeText should be virtual ... override in the polymorphism case.

    There is no need to reference count just because you are using type erasure. There is no need not to use reference counting just because you are using polymorphsm.

    Your Object could wrap T*s like how you stored vectors of raw pointers in the other case, with manual destruction of their contents (equivalent to having to call delete). Your Object could wrap a std::shared_ptr<T>, and in the other case you could have vector of std::shared_ptr<T>. Your Object could contain a std::unique_ptr<T>, equivalent to having a vector of std::unique_ptr<T> in the other case. Your Object's ObjectModel could extract copy constructors and assignment operators from the T and expose them to Object, allowing full-on value semantics for your Object, which corresponds to the a vector of T in your polymorphism case.