Problem
I'm having trouble using Boost Serialization library with a recursive data structure. More precisely, I want to serialize a matrix which is represented by nodes containing a value, and where each node has access to its neighbors (top,bottom,left,right). In order to access a node, each entry point is stored in a vector
(that is the first and the last node of each row and and each column) . Here is the Node
class :
class Node
{
private:
int v;
Node* left;
Node* right;
Node* top;
Node* bottom;
public:
Node() : v(rand() % 100), left(NULL), right(NULL), top(NULL), bottom(NULL)
{
}
//Potential memory leak but that's not the topic
void setLeft(Node* toSet) { left = toSet; }
void setRight(Node* toSet) { right = toSet; }
void setTop(Node* toSet) { top = toSet; }
void setBottom(Node* toSet) { bottom = toSet; }
Node* gLeft() { return left; }
Node* gRight() { return right; }
Node* gTop() { return top; }
Node* gBottom() { return bottom; }
int gValue() { return v; }
template<class Archive>
void serialize(Archive& ar, const unsigned int version)
{
ar& v;
ar& left;
ar& right;
ar& top;
ar& bottom;
}
};
Here is the Matrix
class, with a generateValues()
function for this example :
class Matrix
{
private:
int m, n;
std::vector<Node*> firstNodesPerRow;
std::vector<Node*> lastNodesPerRow;
std::vector<Node*> firstNodesPerCol;
std::vector<Node*> lastNodesPerCol;
public:
Matrix(int m, int n) :
m(m), n(n),
firstNodesPerRow(m, NULL), lastNodesPerRow(m,NULL),
firstNodesPerCol(n, NULL),lastNodesPerCol(n, NULL)
{
}
void generateValues()
{
for (int i = 0; i < m; i++)
{
for (int j = 0; j < n; j++)
{
Node* toWrite = new Node();
if (i > 0)
{
toWrite->setTop(lastNodesPerCol.at(j));
lastNodesPerCol.at(j)->setBottom(toWrite);
lastNodesPerCol.at(j) = toWrite;
}
else
{
firstNodesPerCol.at(j) = toWrite;
lastNodesPerCol.at(j) = toWrite;
}
if (j > 0)
{
toWrite->setLeft(lastNodesPerRow.at(i));
lastNodesPerRow.at(i)->setRight(toWrite);
lastNodesPerRow.at(i) = toWrite;
}
else
{
firstNodesPerRow.at(i) = toWrite;
lastNodesPerRow.at(i) = toWrite;
}
}
}
}
template<class Archive>
void serialize(Archive& ar, const unsigned int version)
{
ar& m;
ar& n;
ar& firstNodesPerRow;
ar& firstNodesPerCol;
ar& lastNodesPerRow;
ar& lastNodesPerCol;
}
};
So what I want to achieve is serializing and deserializing a Matrix
. Here is my main
function :
#include <cstdlib>
#include <sstream>
#include <vector>
#include <boost/archive/text_oarchive.hpp>
#include <boost/archive/text_iarchive.hpp>
#include <boost/serialization/vector.hpp>
int main(int argc, char* argv[])
{
int m = 10; int n = 10;
Matrix toSerialize(m,n);
toSerialize.generateValues();
/*
1) Serialize
*/
std::ostringstream oss;
boost::archive::text_oarchive oa(oss);
oa << toSerialize;
std::string serialiazedData = oss.str();
/*
2) Deserialize
*/
Matrix result(m,n);
std::stringstream serializedDataStream(serialiazedData);
boost::archive::text_iarchive ia(serializedDataStream);
ia >> result;
return EXIT_SUCCESS;
}
The problem is the following : given a sufficiently large value m
or n
, main
ends up with a stack-overflow
exception. I know that's it's coming from the serialize
method of Node
, because in order to serialize a node, it needs to serialize the neighbors and so on... I found this topic which seems to be exactly the same question. The answer is interesting as a starting point but I'm having trouble to implement it because it's not sufficiently precise. What I understand is that to solve my problem, I need to :
stack-overflow
;top,bottom,right,left
of Node
I'm having trouble to actually implement this solution, because the only way I can imagine to do point 1 implies to remove the serialization of top,bottom,right,left
in the serialize
method of Node
, but then I can't achieve point 2?
Edit : I made a diagram of a matrix to help reading. Note that there is not necessarily m x n
nodes in a m x n
matrix.
Edit 2 : The solution I had in mind (not working, ends up with a stack-overflow when deserializing).
//Class Node
template<class Archive>
void serialize(Archive& ar, const unsigned int version)
{
ar& v;
}
//Class Matrix
template<class Archive>
void serialize(Archive& ar, const unsigned int version)
{
ar& m;
ar& n;
for (int i = 0; i < m; i++)
{
Node* current = firstNodesPerRow.at(i);
while (1)
{
if (current == NULL) { break; }
ar& current;
current = current->gRight();
}
}
ar& firstNodesPerRow;
ar& firstNodesPerCol;
ar& lastNodesPerRow;
ar& lastNodesPerCol;
}
Solution
The explanation of the solution is given in the post marked as the answer. Here is an implementation of this solution :
// class Node
template<class Archive>
void serialize(Archive& ar, const unsigned int version)
{
ar& v;
}
// some buffer struct
struct Neighbors
{
Node* top;
Node* bottom;
Node* left;
Node* right;
template <typename Archive>
void serialize(Archive& ar, const unsigned int version)
{
ar& top;
ar& bottom;
ar& left;
ar& right;
}
};
//class Matrix
template<class Archive>
void save(Archive& ar, const unsigned int version) const
{
std::map<Node*, Neighbors> neighborsPerNode;
for (int i = 0; i < m; i++)
{
Node* current = firstNodesPerRow.at(i);
while (1)
{
if (current == NULL) { break; }
neighborsPerNode[current] = {
current->gTop(),
current->gBottom(),
current->gLeft(),
current->gRight(),
};
current = current->gRight();
}
}
ar& neighborsPerNode;
ar& m;
ar& n;
ar& firstNodesPerRow;
ar& firstNodesPerCol;
ar& lastNodesPerRow;
ar& lastNodesPerCol;
}
template<class Archive>
void load(Archive& ar, const unsigned int version)
{
// Warning ALL the nodes are browsed 2 times :
// 1 - to deserialize neighborsPerNode (below)
// 2 - in the for loop, to create the links between the nodes
std::map<Node*, Neighbors> neighborsPerNode;
ar& neighborsPerNode;
ar& m;
ar& n;
ar& firstNodesPerRow;
ar& firstNodesPerCol;
ar& lastNodesPerRow;
ar& lastNodesPerCol;
for (int i = 0; i < m; i++)
{
Node* current = firstNodesPerRow.at(i);
while (1)
{
if (current == NULL) { break; }
Neighbors myNeighbors = neighborsPerNode.at(current);
current->setTop(myNeighbors.top);
current->setBottom(myNeighbors.bottom);
current->setLeft(myNeighbors.left);
current->setRight(myNeighbors.right);
current = current->gRight();
}
}
}
BOOST_SERIALIZATION_SPLIT_MEMBER()
Why not simply serialize all elements in order of the matrix and avoid the function call recursion entirely, e.g.:
template<class Archive>
void serialize(Archive& ar, const unsigned int version)
{
ar& m;
ar& n;
for (int i = 0; i < m; ++i)
{
Node* node = firstNodesPerRow[i];
for (int j = 0; j < n; ++j)
{
ar & node->gValue();
node = node->gRight();
}
}
}
BTW This works in saving the matrix. I think you need to specialize the serialize function in save and load versions, because for the load version you need to:
template<class Archive>
void save(Archive & ar, const unsigned int version) const
{
...
}
template<class Archive>
void load(Archive & ar, const unsigned int version)
{
...
}
BOOST_SERIALIZATION_SPLIT_MEMBER()
In the complicated case where nodes may be missing, the same idea applies. And you still need to split between save/load, adding allocation to load. But it takes a lot more bookkeeping to correctly save & load again.
E.g., you could first iterate over all nodes as above but creating a map from each Nodes pointer value to an unique increasing ID number. When you save each node's value (as above row by row), also save each direction's node ID. When you load, you first make a empty map: ID -> node pointer. Then restore nodes again row by row, while restoring neighbour pointers as well from map. Of course whenever an ID is first encountered you need to allocate a new node.