I have a program running on an Intel Edison (32-bit Yocto Linux). It reads sensor data and then writes that data to a file. The data arrives in packets of 1 int and 13 doubles, with 100 packets arriving every second. Some time later, I will be pulling files off the device and reading them with a tool running on an x64 Windows machine.
At the moment I am writing the data as a raw text file (since strings are nice and portable). However, because of the amount of data that will be written, I'm looking for ways to save space, while making sure that no data is lost when it is interpreted on the other side.
My initial idea was to go ahead and create a struct that looks like this:
struct dataStruct {
    char front;
    int a;
    double b, c, d, e, f, g, h, i, j, l, m, n, o;
    char end;
};
and then do a union of this as follows:
union dataUnion {
    dataStruct d;
    char c[110];
};
// 110 was chosen because an int = 4 chars, and a double = 8 chars,
// so 13*8 = 104, and therefore d = 1 + 4 + 13*8 + 1 = 110
and then write the char array to a file. However, a little bit of reading around tells me that an implementation like that might not necessarily be compatible between OSes (worse... it might work some of the time and not other times...).
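For illustration, here is a minimal check of what worries me (a rough sketch; the exact figures depend on the compiler and ABI): alignment padding means the struct usually isn't even 110 bytes, and the padding can differ between the 32-bit Edison build and a 64-bit Windows build.
#include <cstdio>

struct dataStruct {
    char front;
    int a;
    double b, c, d, e, f, g, h, i, j, l, m, n, o;
    char end;
};

int main() {
    // Typically prints 116 on 32-bit Linux (doubles aligned to 4 bytes)
    // and 120 with a 64-bit Windows compiler (doubles aligned to 8 bytes),
    // not the 110 the union assumes, and that is before considering endianness.
    std::printf("sizeof(dataStruct) = %zu\n", sizeof(dataStruct));
    return 0;
}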
So I am wondering - is there a portable way to save this data without just saving it as raw text?
As the others said: serialization is probably the best solution for your problem.
Since you are working in a resource-constrained environment, I suggest using something like MsgPack. It's header-only (given a C++11 compiler), quite lightweight, the format is simple, and the C++ interface is nice. It even allows you to serialize user-defined types (i.e. classes/structs) very easily:
// adapted from https://github.com/msgpack/msgpack-c/blob/master/QUICKSTART-CPP.md
#include <msgpack.hpp>
#include <vector>
#include <string>

struct dataStruct {
    int a;
    double b, c, d, e, f, g, h, i, j, l, m, n, oo; // yes "oo", because "o" clashes with msgpack :/
    MSGPACK_DEFINE(a, b, c, d, e, f, g, h, i, j, l, m, n, oo);
};

int main(void) {
    std::vector<dataStruct> vec;
    // add some elements into vec...

    // you can serialize dataStruct directly
    msgpack::sbuffer sbuf;
    msgpack::pack(sbuf, vec);

    msgpack::unpacked msg;
    msgpack::unpack(&msg, sbuf.data(), sbuf.size());
    msgpack::object obj = msg.get();

    // you can convert the object back to a std::vector<dataStruct> directly
    std::vector<dataStruct> rvec;
    obj.convert(&rvec);
}
As an alternative, you can check out Google's FlatBuffers. It seems quite resource-efficient, though I haven't tried it myself yet.
EDIT: Here's a complete example illustrating the whole serialization - file I/O - deserialization cycle:
// adapted from:
// https://github.com/msgpack/msgpack-c/blob/master/QUICKSTART-CPP.md
// https://github.com/msgpack/msgpack-c/wiki/v1_1_cpp_unpacker#msgpack-controls-a-buffer
#include <msgpack.hpp>
#include <fstream>
#include <iostream>

using std::cout;
using std::endl;

struct dataStruct {
    int a;
    double b, c, d, e, f, g, h, i, j, l, m, n, oo; // yes "oo", because "o" clashes with msgpack :/
    MSGPACK_DEFINE(a, b, c, d, e, f, g, h, i, j, l, m, n, oo);
};

std::ostream& operator<<(std::ostream& out, const dataStruct& ds)
{
    out << "[a:" << ds.a << " b:" << ds.b << " ... oo:" << ds.oo << "]";
    return out;
}

int main(void) {
    // serialize
    {
        // prepare the (buffered) output file; open it in binary mode so the
        // bytes are not mangled on platforms that translate line endings
        std::ofstream ofs("log.bin", std::ios::binary);

        // prepare a data structure and fill in sample data
        dataStruct ds;
        ds.a = 1;
        ds.b = 1.11;
        ds.oo = 101;
        msgpack::pack(ofs, ds);
        cout << "serialized: " << ds << endl;

        ds.a = 2;
        ds.b = 2.22;
        ds.oo = 202;
        msgpack::pack(ofs, ds);
        cout << "serialized: " << ds << endl;

        // continuously receiving data
        //while ( /* data is being received... */ ) {
        //
        //    // initialize ds...
        //
        //    // serialize ds
        //    // You can use any class that has the member function described here:
        //    // https://github.com/msgpack/msgpack-c/wiki/v1_1_cpp_packer#buffer
        //    msgpack::pack(ofs, ds);
        //}
    }

    // deserialize
    {
        // prepare the input file (again in binary mode)
        std::ifstream ifs("log.bin", std::ios::binary);
        std::streambuf* pbuf = ifs.rdbuf();

        // The chunk size may be decided by receive performance, the transport
        // layer's protocol and so on.
        const std::size_t try_read_size = 100; // arbitrary number...
        msgpack::unpacker unp;
        dataStruct ds;

        // read data while there are still unprocessed bytes...
        while (pbuf->in_avail() > 0) {
            unp.reserve_buffer(try_read_size);
            // unp has at least try_read_size bytes of buffer at this point.

            // read the message into msgpack::unpacker's internal buffer directly
            std::size_t actual_read_size = ifs.readsome(unp.buffer(), try_read_size);

            // tell msgpack::unpacker the actually consumed size
            unp.buffer_consumed(actual_read_size);

            msgpack::unpacked result;
            // MessagePack data loop
            while (unp.next(result)) {
                msgpack::object obj(result.get());
                obj.convert(&ds);
                // use ds
                cout << "deserialized: " << ds << endl;
            }
            // All complete msgpack messages have been processed at this point,
            // so continue reading additional data.
        }
    }
}
Output:
serialized: [a:1 b:1.11 ... oo:101]
serialized: [a:2 b:2.22 ... oo:202]
deserialized: [a:1 b:1.11 ... oo:101]
deserialized: [a:2 b:2.22 ... oo:202]