
Boost serialization of reference member abstract class


I'm trying to figure out how to serialize a class that I put together with Boost. I'll get right to the code:

#ifndef TEST_H_
#define TEST_H_

#include <iostream>
#include <boost/serialization/serialization.hpp>
#include <boost/archive/binary_oarchive.hpp>
#include <boost/archive/binary_iarchive.hpp>

class Parent
{
public:
        int test_val = 1234234;
        int p()
        {
                return 13294;
        }
        int get_test_val()
        {
                std::cout << test_val << std::endl;
                return test_val;
        }
        friend class boost::serialization::access;
        template<class Archive>
            void serialize(Archive &ar, const unsigned int /*version*/)
        {
                ar &test_val;
        }
};

class RefMem : public Parent
{
public: 
        RefMem()
        {
                test_val = 12342;
                std::cout << test_val << std::endl;
        }
};


class Test
{
public:
        friend class boost::serialization::access;
        int t_;
        Parent &parent_;
        Test(int t, Parent &&parent = RefMem());
        template<class Archive>
        void serialize(Archive &ar, const unsigned int file_version){
                ar &t_;
                ar &parent_;
        }
        //template<class
};


#endif

#include "test.h"
#include <iostream>
#include <sstream>
#include <boost/serialization/serialization.hpp>
#include <boost/archive/text_oarchive.hpp>
#include <boost/archive/text_iarchive.hpp>

Test :: Test(int t, Parent &&parent) : parent_(parent)
{
        std::cout << this->parent_.test_val << std::endl;
        t_ = t;
        parent_ = parent;
}

    int main()
    {
            Test test = Test(50);
            std::cout << "t_: " << test.t_ << std::endl;
            std::cout << "Test val: " << test.parent_.get_test_val() << std::endl;
            std::ostringstream oss;
            {
                    boost::archive::text_oarchive oa(oss);
                    oa << test;
            }

            Test cloned;
            std::istringstream iss(oss.str());
                {   
                    boost::archive::text_iarchive ia(iss);
                    ia >> cloned;
                }


            std::cout << "t_: " << cloned.t_ << std::endl;
            std::cout << "Test val: " << cloned.parent_.get_test_val() << std::endl;
    }

I'm basically shooting in the dark. I'm new to C++ and I could get a basic example to work, but nothing like this where I serialize a reference member that is a child of an abstract class and then deserialize it. This code is just replicating what I'm trying to do in another program. I have a few random functions/variables just for testing.

Edit: How would I get this code to compile and work properly?


Solution

  • You're confused about the ownership semantics of references.

    1. The reference parent_ merely "points" to an instance of RefMem¹. When you serialize, writing it out is "easy": it is an lvalue reference, so the value it refers to simply gets serialized.

      For deserialization, however, things are not so simple, because we do not have an instance of RefMem to "point" to. We could expect Boost Serialization to (somehow) dynamically instantiate a RefMem out of thin air and silently make the reference point to it. However, at best this will lead to memory leaks.

    2. There's another thing about reference members specifically. Reference members can only be initialized in the constructor's initializer list.

      Because Boost Serialization serializes values, it does not construct these objects, so the question is how the reference can be initialized at all.

      Your current constructor has a number of related issues:

      Test(int t, Parent && parent = RefMem()) : parent_(parent) {
          std::cout << __FUNCTION__ << ":" << this->parent_.test_val << "\n";
          t_      = t;
          parent_ = parent; // OOPS! TODO FIXME
      }
      
      • firstly, the user-declared constructor suppresses the compiler-generated default constructor, so the line Test cloned; cannot even compile
      • secondly, the default argument for parent is a temporary bound to an rvalue reference; it is destroyed once the constructor call completes, leaving parent_ dangling. Your program has Undefined Behaviour
      • thirdly, the line

        parent_ = parent; // OOPS! TODO FIXME
        

        doesn't do what you think it does. It copies the value of the Parent object from parent over the object referred to by parent_. This is likely not visible as parent_ and parent are the same object here, but there's even Object Slicing involved (What is object slicing?).


    What to do?

    Best to regroup and hit the documentation for Serialization of References:

    Classes that contain reference members will generally require non-default constructors as references can only be set when an instance is constructed. The example of the previous section is slightly more complex if the class has reference members. This raises the question of how and where the objects being referred to are stored and how are they created. Also there is the question about references to polymorphic base classes. Basically, these are the same questions that arise regarding pointers. This is no surprise as references are really a special kind of pointer.

    We address these questions by serializing references as though they were pointers.

    (emphasis mine)

    The documentation does go on to suggest load_construct_data/save_construct_data to alleviate the non-default-constructibility of Test.

    Note that their suggestion to handle the reference member as a pointer seems nice, but it only makes sense if the actual pointed-to object is also serialized through a pointer in the same archive. In that case, Object Tracking will spot the aliasing pointer and avoid creating a duplicate instance.

    If not, you'll still have your memory leak, and possibly broken program state.

    Demo Using load/save_construct_data

    Here's a demo of essentially the technique outlined above. Note that we're leaking the dynamically allocated objects. I don't like this style because it's essentially treating the reference as if it were a pointer.

    If that's what we want, we should consider using pointers (see below)

    Live On Coliru

    #ifndef TEST_H_
    #define TEST_H_
    
    #include <iostream>
    #include <boost/serialization/serialization.hpp>
    #include <boost/archive/binary_oarchive.hpp>
    #include <boost/archive/binary_iarchive.hpp>
    
    class Parent {
      public:
        int test_val = 1234234;
    
        int p() { return 13294; }
    
        int get_test_val() {
            std::cout << __PRETTY_FUNCTION__ << ":" << test_val << "\n";
            return test_val;
        }
    
        template <class Archive> void serialize(Archive &ar, unsigned) {
            ar & test_val; 
        }
    };
    
    class RefMem : public Parent {
      public:
        RefMem() {
            test_val = 12342;
            std::cout << __PRETTY_FUNCTION__ << ":" << test_val << "\n";
        }
    };
    
    class Test {
      public:
        friend class boost::serialization::access;
        int t_;
        Parent &parent_;
    
        Test(int t, Parent& parent) : parent_(parent) {
            std::cout << __PRETTY_FUNCTION__ << ":" << this->parent_.test_val << "\n";
            t_      = t;
        }
    
        template <class Archive> void serialize(Archive &ar, const unsigned int file_version) {
            ar &t_;
            //ar &parent_; // how would this behave? We don't own it... Use pointers
        }
        // template<class
    };
    
    namespace boost { namespace serialization {
        template<class Archive>
            inline void save_construct_data(Archive & ar, const Test * t, const unsigned int file_version) {
                // save data required to construct instance
                ar << t->t_;
                // serialize reference to Parent as a pointer
                Parent* pparent = &t->parent_;
                ar << pparent;
            }
    
        template<class Archive>
            inline void load_construct_data(Archive & ar, Test * t, const unsigned int file_version) {
                // retrieve data from archive required to construct new instance
                int m;
                ar >> m;
                // create and load data through pointer to Parent
                // tracking handles issues of duplicates.
                Parent * pparent;
                ar >> pparent;
                // invoke inplace constructor to initialize instance of Test
                ::new(t)Test(m, *pparent);
            }
    }}
    
    #endif
    
    #include <iostream>
    #include <sstream>
    #include <boost/serialization/serialization.hpp>
    #include <boost/archive/text_oarchive.hpp>
    #include <boost/archive/text_iarchive.hpp>
    
    int main() {
        Parent* the_instance = new RefMem;
    
        Test test = Test(50, *the_instance);
    
        std::cout << "t_: " << test.t_ << "\n";
        std::cout << "Test val: " << test.parent_.get_test_val() << "\n";
        std::ostringstream oss;
        {
            boost::archive::text_oarchive oa(oss);
            Test* p = &test;
            oa << the_instance << p; // NOTE SERIALIZE test AS-IF A POINTER
        }
    
        {
            Parent* the_cloned_instance = nullptr;
            Test* cloned = nullptr;
    
            std::istringstream iss(oss.str());
            {
                boost::archive::text_iarchive ia(iss);
                ia >> the_cloned_instance >> cloned;
            }
    
            std::cout << "t_: " << cloned->t_ << "\n";
            std::cout << "Test val: " << cloned->parent_.get_test_val() << "\n";
            std::cout << "Are Parent objects aliasing: " << std::boolalpha << 
                (&cloned->parent_ == the_cloned_instance) << "\n";
        }
    }
    

    Prints

    RefMem::RefMem():12342
    Test::Test(int, Parent&):12342
    t_: 50
    int Parent::get_test_val():12342
    Test val: 12342
    Test::Test(int, Parent&):12342
    t_: 50
    int Parent::get_test_val():12342
    Test val: 12342
    Are Parent objects aliasing: true
    

    Alternatively: say what we want

    To avoid the leakiness and the usability issues associated with reference members, let's use a shared_ptr instead!

    Live On Coliru

    #include <iostream>
    #include <boost/serialization/serialization.hpp>
    #include <boost/serialization/shared_ptr.hpp>
    #include <boost/archive/text_oarchive.hpp>
    #include <boost/archive/text_iarchive.hpp>
    #include <boost/make_shared.hpp>
    
    class Parent {
      public:
        int test_val = 1234234;
    
        int p() { return 13294; }
    
        int get_test_val() {
            std::cout << __PRETTY_FUNCTION__ << ":" << test_val << "\n";
            return test_val;
        }
    
        template <class Archive> void serialize(Archive &ar, unsigned) {
            ar & test_val; 
        }
    };
    
    class RefMem : public Parent {
      public:
        RefMem() {
            test_val = 12342;
            std::cout << __PRETTY_FUNCTION__ << ":" << test_val << "\n";
        }
    };
    
    using ParentRef = boost::shared_ptr<Parent>;
    
    class Test {
      public:
        int t_ = 0;
        ParentRef parent_;
    
        Test() = default;
        Test(int t, ParentRef parent) : t_(t), parent_(parent) { }
    
        template <class Archive> void serialize(Archive &ar, const unsigned int file_version) {
            ar & t_ & parent_;
        }
    };
    
    #include <sstream>
    
    int main() {
        ParentRef the_instance = boost::make_shared<RefMem>();
    
        Test test = Test(50, the_instance);
    
        std::cout << "t_: " << test.t_ << "\n";
        std::cout << "Test val: " << test.parent_->get_test_val() << "\n";
        std::ostringstream oss;
        {
            boost::archive::text_oarchive oa(oss);
            oa << the_instance << test; // test is serialized by value; parent_ is tracked through the shared_ptr
        }
    
        {
            ParentRef the_cloned_instance;
            Test cloned;
    
            std::istringstream iss(oss.str());
            {
                boost::archive::text_iarchive ia(iss);
                ia >> the_cloned_instance >> cloned;
            }
    
            std::cout << "t_: " << cloned.t_ << "\n";
            std::cout << "Test val: " << cloned.parent_->get_test_val() << "\n";
            std::cout << "Are Parent objects aliasing: " << std::boolalpha << 
                (cloned.parent_ == the_cloned_instance) << "\n";
        }
    }
    

    Note that there is no complication anymore. No memory leaks, not even when you don't serialize the RefMem instance separately. And the object tracking works fine with shared pointers (as implemented through boost/serialization/shared_ptr.hpp).


    ¹ or anything else deriving from Parent, obviously