c++ undefined-behavior downcast

Downcasting base class instance to empty child interface


I was wondering about the validity of downcasting a base class instance to an empty "interface" child class. See the example below. Basically, I want to store data in a generic, template-free way (the data is read from a file and can be of any simple arithmetic type; std::is_arithmetic<T> should always be true in the following example).

#include <cstddef>
#include <cstdint>
#include <iostream>
#include <type_traits>
#include <vector>

/**
 * Mother class holding data.
 */
class NonTypedData
{
    std::vector<uint8_t> data_;

    public:

    NonTypedData(std::size_t size) : data_(size) {}

    std::size_t size() const { return data_.size(); }

    const uint8_t* get() const { return data_.data(); }
          uint8_t* get()       { return data_.data(); }
};

/**
 * Empty child interface class that handles typed access
 */
template <typename T>
struct TypedData : public NonTypedData
{
    static_assert(std::is_arithmetic<T>::value, "TypedData not supported for non arithmetic types.");

    const T* get() const { return reinterpret_cast<const T*>(this->NonTypedData::get()); }
          T* get()       { return reinterpret_cast<      T*>(this->NonTypedData::get()); }
    std::size_t size() const { return this->NonTypedData::size() / sizeof(T); }
};

template <typename T>
void fill(TypedData<T>& data)
{
    for(std::size_t i = 0; i < data.size(); i++) {
        data.get()[i] = i;
    }
}

template <typename T>
void print(const TypedData<T>& data)
{
    for(std::size_t i = 0; i < data.size(); i++) {
        std::cout << ' ' << data.get()[i];
    }
    std::cout << std::endl;
}

int main()
{
    NonTypedData data(10*sizeof(int));

    fill(static_cast<TypedData<int>&>(data));
    print(static_cast<const TypedData<int>&>(data));

    return 0;
}


I strongly feel that this is undefined-behavior but I am not sure about it. Is it?

On the other hand, I find this kind of construct very useful, and I feel there is probably a way to do something like this without being in UB-land. What do you think? Maybe a reference to a NonTypedData instance held inside a TypedData instance?


Solution

  • static_cast<const TypedData<int>&>(data)
    

    You can't do that. data is a NonTypedData, not a TypedData<int>, and you mustn't use a static_cast to down-cast to a type unrelated to the actual type of the object at the storage location of data. That part is already UB.

    reinterpret_cast<const T*>(this->NonTypedData::get());
    

    You can't do this either. You can only cast the pointer to T* and access through it if it originally pointed to a storage location of type T. You allocated storage explicitly for uint8_t, which has weaker (the weakest possible) alignment requirements.

    So in order to access it, you need to state explicitly (with std::memcpy, or with C++20 std::bit_cast on a byte array of the same size) that you are accessing storage of an incompatible type.

    This makes a huge difference in the compiled code! The compiler may emit instructions with hard alignment constraints when accessing the storage location of the assumed type. A std::bit_cast or memcpy will usually not actually "copy" the data (with optimizations enabled), but it will force the compiler to be careful about not assuming alignment and the like.


    Both are the same basic mistake: C++ will permit a lot of static / reinterpreting casts between pointers, but the result of the cast (except when casting to a pointer to a byte type such as char, unsigned char or std::byte, which are special; uint8_t is normally an alias for unsigned char) may only be used if the pointer was originally pointing at an object of the target type.

    The static_cast down-cast is already UB by itself. Accessing the memory through a pointer obtained that way is twice so.
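As a rough illustration of the points above, here is a minimal sketch of one way to get typed access without any down-cast or reinterpret_cast. It follows the idea raised in the question of holding a reference to the NonTypedData instance, and it goes through std::memcpy for every element access. The names TypedView, load and store are hypothetical and not part of the original code; other designs are possible.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <iostream>
#include <type_traits>
#include <vector>

// Owns raw bytes only; same role as NonTypedData in the question.
class NonTypedData
{
    std::vector<std::uint8_t> data_;

public:
    explicit NonTypedData(std::size_t size) : data_(size) {}

    std::size_t size() const { return data_.size(); }

    const std::uint8_t* get() const { return data_.data(); }
          std::uint8_t* get()       { return data_.data(); }
};

// Hypothetical non-owning view. It holds a reference instead of
// inheriting, so no down-cast is ever needed.
template <typename T>
class TypedView
{
    static_assert(std::is_arithmetic<T>::value,
                  "TypedView not supported for non arithmetic types.");

    NonTypedData& bytes_;

public:
    explicit TypedView(NonTypedData& bytes) : bytes_(bytes) {}

    // Number of whole T elements that fit in the byte buffer.
    std::size_t size() const { return bytes_.size() / sizeof(T); }

    // Copy the bytes of element i into a real T object. No T* ever
    // points into the byte buffer, so alignment is not assumed.
    T load(std::size_t i) const
    {
        T value;
        std::memcpy(&value, bytes_.get() + i * sizeof(T), sizeof(T));
        return value;
    }

    // Copy the object representation of value into element i.
    void store(std::size_t i, T value)
    {
        std::memcpy(bytes_.get() + i * sizeof(T), &value, sizeof(T));
    }
};

template <typename T>
void fill(TypedView<T> view)
{
    for (std::size_t i = 0; i < view.size(); i++) {
        view.store(i, static_cast<T>(i));
    }
}

template <typename T>
void print(TypedView<T> view)
{
    for (std::size_t i = 0; i < view.size(); i++) {
        std::cout << ' ' << view.load(i);
    }
    std::cout << std::endl;
}
```

With this sketch, the main from the question would construct a `TypedView<int> view(data);` and call `fill(view)` and `print(view)` instead of casting. The trade-off is that elements are read and written by value through load and store rather than through a T*, which is what keeps the access well-defined for a uint8_t buffer.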