c++ encoding deserialization

How to pack the byte arrays of all data members into a single vector?


I have created a function serialize which takes a Data object (a class containing 4 members: int32, int64, float, double) as input and returns an encoded vector of bytes of all members, which I will then pass to a deserialize function to get the original data back.

std::vector<uint8_t> serialize(Data &D)
{

    std::vector<uint8_t> serialized_data;
    std::vector<uint8_t> intwo = encode(D.Int32);  // output [32 13 24 0]
    std::vector<uint8_t> insf = encode(D.Int64);    // output [233 244 55 134 255 23 55]
    // float: copy the raw bytes (binary format)
    float ft = D.Float;    // float value, e.g. 4.55
    char result[sizeof(float)];
    memcpy(result, &ft, sizeof(ft));
    // double: copy the raw bytes (binary format)
    double dt = D.Double;    // double value, e.g. 4.55
    char resultdouble[sizeof(double)];
    memcpy(resultdouble, &dt, sizeof(dt));
    /////
    ///// How to bind everything here?
    /////

    return serialized_data;
}


Data deserialize(std::vector<uint8_t> &Bytes)  // vector returned from the function above
{
    Data D2;

    D2.Int64 = decode(Bytes, D2);
    // D2.Int32 = decode(Bytes, D2);
    // D2.Float = decode(Bytes, D2);
    // D2.Double = decode(Bytes, D2);

    // Return original data (all class members)
    return D2;
}

I don't have any idea how to move forward. Q1. If I bind everything in a single vector, how would I dissect it while deserializing? Should there be some kind of delimiter? Q2. Is there any better way of doing it?


Solution

  • If I bind everything in a single vector, how would I dissect it while deserializing? Should there be some kind of delimiter?

    In a stream, you either know what type comes next, or you need some sort of type indicator in the stream, such as "Here comes a vector of int with size ...":

    vector int size elem1 elem2 ... elemX
    

    Depending on how many types you need to support, the type information could be 1 or more bytes. If the smallest "unknown" entities are your classes, then you need one indicator per class you aim to support.
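
As a rough sketch of the type-indicator idea, the following writes a one-byte tag in front of each value and lets the reader dispatch on it. The `Tag` enum, its values, and the helper names `put`, `take` and `read_all_as_double` are made up for illustration, and values are copied in the machine's native byte order, so the bytes are not portable between machines with different endianness.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

// Hypothetical one-byte type indicators; the numeric values are arbitrary.
enum class Tag : std::uint8_t { Int32 = 1, Double = 2 };

// Append the tag, then the raw bytes of the value (native byte order).
template <class T>
void put(std::vector<std::uint8_t>& out, Tag tag, const T& value) {
    out.push_back(static_cast<std::uint8_t>(tag));
    const auto* p = reinterpret_cast<const std::uint8_t*>(&value);
    out.insert(out.end(), p, p + sizeof(T));
}

// Copy one T out of the stream at pos and advance pos past it.
template <class T>
T take(const std::vector<std::uint8_t>& in, std::size_t& pos) {
    if (in.size() - pos < sizeof(T)) throw std::runtime_error("truncated stream");
    T value;
    std::memcpy(&value, in.data() + pos, sizeof(T));
    pos += sizeof(T);
    return value;
}

// Walk the stream: each tag tells the reader which type follows.
std::vector<double> read_all_as_double(const std::vector<std::uint8_t>& in) {
    std::vector<double> values;
    std::size_t pos = 0;
    while (pos < in.size()) {
        const Tag tag = static_cast<Tag>(in[pos++]);
        switch (tag) {
            case Tag::Int32:  values.push_back(take<std::int32_t>(in, pos)); break;
            case Tag::Double: values.push_back(take<double>(in, pos)); break;
            default: throw std::runtime_error("unknown type tag");
        }
    }
    return values;
}
```

The reader never needs a delimiter: the tag determines the type, and the type determines how many bytes to consume.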

    If you know exactly what should be in the stream, the type information for vector and int could be left out:

    size elem1 elem2 ... elemX
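
For the class in the question, where the member types and their order are known up front, one possible sketch needs neither tags nor delimiters: write the members in a fixed order and read them back in the same order while advancing a position. The `Data` struct below is a stand-in using the member names from the question, `append_raw` and `read_raw` are hypothetical helpers, and every member is assumed to be stored at its fixed `sizeof` width in native byte order. If `encode` produces variable-length output instead, each field would additionally need a length prefix.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <type_traits>
#include <vector>

// Stand-in for the class in the question; member names follow its code.
struct Data {
    std::int32_t Int32;
    std::int64_t Int64;
    float Float;
    double Double;
};

// Append the raw bytes of a trivially copyable value.
template <class T>
void append_raw(std::vector<std::uint8_t>& out, const T& value) {
    static_assert(std::is_trivially_copyable<T>::value, "raw copy needs a trivially copyable type");
    const auto* p = reinterpret_cast<const std::uint8_t*>(&value);
    out.insert(out.end(), p, p + sizeof(T));
}

// Read one T starting at pos and advance pos past it.
template <class T>
T read_raw(const std::vector<std::uint8_t>& in, std::size_t& pos) {
    if (pos > in.size() || in.size() - pos < sizeof(T))
        throw std::runtime_error("truncated input");
    T value;
    std::memcpy(&value, in.data() + pos, sizeof(T));
    pos += sizeof(T);
    return value;
}

std::vector<std::uint8_t> serialize(const Data& d) {
    std::vector<std::uint8_t> out;
    append_raw(out, d.Int32);
    append_raw(out, d.Int64);
    append_raw(out, d.Float);
    append_raw(out, d.Double);
    return out;
}

Data deserialize(const std::vector<std::uint8_t>& in) {
    std::size_t pos = 0;
    Data d;
    // Must read in exactly the order serialize wrote.
    d.Int32 = read_raw<std::int32_t>(in, pos);
    d.Int64 = read_raw<std::int64_t>(in, pos);
    d.Float = read_raw<float>(in, pos);
    d.Double = read_raw<double>(in, pos);
    return d;
}
```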
    

  • Is there any better way of doing it?

    One simplification could be to make serialize more generic so you can reuse it. If you have some

    std::vector<uint8_t> encode(const T& x)
    

    overloads for the fundamental types (and perhaps container types) you'd like to support, you could make it something like this:

    template <class... Ts>
    std::vector<uint8_t> serialize(Ts&&... ts) {
        std::vector<uint8_t> serialized_data;
    
        [](auto& data, auto&&... vs) {
            (data.insert(data.end(), vs.begin(), vs.end()), ...);
        }(serialized_data, encode(ts)...);
    
        return serialized_data;
    }
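
To show how the pieces could fit together, here is a self-contained sketch (C++17, because of the fold expression) that pairs the serialize template with one assumed `encode` overload covering all arithmetic types. That overload simply copies the raw bytes in native byte order and is not the `encode` from the question. The `Data` struct is a stand-in using the question's member names.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Assumed encode overload for arithmetic types: raw bytes, native byte order.
// It is declared before serialize so the call encode(ts) can find it.
template <class T, std::enable_if_t<std::is_arithmetic<T>::value, int> = 0>
std::vector<std::uint8_t> encode(const T& x) {
    std::vector<std::uint8_t> bytes(sizeof(T));
    std::memcpy(bytes.data(), &x, sizeof(T));
    return bytes;
}

// The serialize template from above: encode every argument, then append
// each encoded chunk to the result in argument order.
template <class... Ts>
std::vector<std::uint8_t> serialize(Ts&&... ts) {
    std::vector<std::uint8_t> serialized_data;

    [](auto& data, auto&&... vs) {
        (data.insert(data.end(), vs.begin(), vs.end()), ...);
    }(serialized_data, encode(ts)...);

    return serialized_data;
}

// Stand-in for the class in the question.
struct Data {
    std::int32_t Int32;
    std::int64_t Int64;
    float Float;
    double Double;
};

// Encoding the class is a single call listing its members.
std::vector<std::uint8_t> encode(const Data& d) {
    return serialize(d.Int32, d.Int64, d.Float, d.Double);
}
```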
    

    You could then write serialization for a class simply by calling serialize with all the member variables, which also makes serialization of composite types pretty easy:

    struct Foo {
        int32_t x;                  // encode(int32_t) needed
        std::string y;              // encode(const string&) needed
        std::vector<std::string> z; // encode(const vector<T>&) + encode(const string&)
    };
    
    std::vector<uint8_t> encode(const Foo& f) {
        return serialize(f.x, f.y, f.z);
    }
    
    struct Bar {
        Foo f;                      // encode(const Foo&) needed
        std::string s;              // encode(const string&) needed
    };
    
    std::vector<uint8_t> encode(const Bar& b) {
        return serialize(b.f, b.s);
    }
    

    The above makes encoding of classes pretty straightforward. To add serialization, you could add an adapter which simply references the object to serialize, encodes it and writes the encoded data to an ostream:

    struct BarSerializer {
        Bar& b;
        friend std::ostream& operator<<(std::ostream& os, const BarSerializer& bs) {
            auto s = encode(bs.b);  // encode(const Bar&) needed
            return os.write(reinterpret_cast<const char*>(s.data()), s.size());
        }    
    };
    

    You'd write the deserialize function template and the decode overloads in a similar manner.
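
One way the mirror image might look, as a sketch under the same assumptions (fixed-width values in native byte order, C++17): a `decode` overload reads one value at a running position, and a variadic `deserialize` calls it for each output in order. The signatures here, including the `pos` parameter, and the `Point` struct are illustrative choices, not something prescribed above.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <type_traits>
#include <vector>

// Assumed decode overload for arithmetic types: copy sizeof(T) bytes
// starting at pos into out, then advance pos.
template <class T, std::enable_if_t<std::is_arithmetic<T>::value, int> = 0>
void decode(const std::vector<std::uint8_t>& in, std::size_t& pos, T& out) {
    if (pos > in.size() || in.size() - pos < sizeof(T))
        throw std::runtime_error("not enough bytes left to decode");
    std::memcpy(&out, in.data() + pos, sizeof(T));
    pos += sizeof(T);
}

// Decode each output in turn; the comma fold runs left to right,
// so values are read in the same order serialize wrote them.
template <class... Ts>
void deserialize(const std::vector<std::uint8_t>& in, std::size_t& pos, Ts&... outs) {
    (decode(in, pos, outs), ...);
}

// Hypothetical composite type: its decode just lists the members.
struct Point {
    std::int32_t x;
    double y;
};

void decode(const std::vector<std::uint8_t>& in, std::size_t& pos, Point& p) {
    deserialize(in, pos, p.x, p.y);
}
```

Because `decode` for a class only forwards its members to `deserialize`, nested types compose the same way the `encode` overloads do.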