I have a C++ object of type map<string, map<string, torch::Tensor>>
that I want to serialize to a file and read back, using two functions:
#include <torch/torch.h>
using namespace std;
void save_tensor_map(map<string, map<string, torch::Tensor>> m, string fp) {
//
}
map<string, map<string, torch::Tensor>> read_tensor_map(string fp) {
//
}
What is the simplest way to do this?
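Here is my attempt at writing a map to a file. It handles a single-level map<string, torch::Tensor>; for your nested map you would repeat it for each outer key. You should be able to deduce the read function from it. I don't have a compiler at hand to test it, so please tell me if it raises issues.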
#include <cstdint>
#include <fstream>
#include <map>
#include <string>
#include <torch/torch.h>
void save_tensor_map(const std::map<std::string, torch::Tensor>& map, const std::string& filename) {
    std::ofstream out_file(filename, std::ios::out | std::ios::binary);
    for (auto itr = map.begin(); itr != map.end(); ++itr) {
        // Writing the key: length first, then the characters
        const std::string& key = itr->first;
        size_t size = key.size();
        out_file.write(reinterpret_cast<const char*>(&size), sizeof(size));
        out_file.write(key.data(), size);
        // Make sure the data is a dense buffer in host memory before dumping it
        torch::Tensor tensor = itr->second.cpu().contiguous();
        // Writing tensor metadata
        at::IntArrayRef sizes = tensor.sizes();
        int64_t nb_dims = sizes.size();
        out_file.write(reinterpret_cast<const char*>(&nb_dims), sizeof(nb_dims));
        // Dimensions are int64_t; sizeof(long) would be wrong where long is 32-bit
        out_file.write(reinterpret_cast<const char*>(sizes.data()), sizeof(int64_t) * nb_dims);
        int64_t scalar_type = static_cast<int64_t>(tensor.scalar_type());
        out_file.write(reinterpret_cast<const char*>(&scalar_type), sizeof(scalar_type));
        int64_t elem_size = tensor.element_size();
        out_file.write(reinterpret_cast<const char*>(&elem_size), sizeof(elem_size));
        // Writing tensor data
        out_file.write(static_cast<const char*>(tensor.data_ptr()), elem_size * tensor.numel());
    }
}
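In the read function you'll probably need to call torch::from_blob(void* data, at::IntArrayRef sizes, const at::TensorOptions& options) -> torch::Tensor
to rebuild each tensor from the bytes you read. Note that from_blob does not copy or take ownership of the buffer, so call .clone() on the result if the buffer will not outlive the tensor. Otherwise the read function has the same structure as the save function.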
Edit: I just realized you can make the tensor serialization much simpler with torch::save and torch::load, which can write a tensor to, and read it from, a stringstream (which is itself easy to write to and read from a file).