C++ mpz_class and binary files


I am using mpz_class (the C++ interface of MPIR 2.5.1, with Visual Studio C++ 2010). It is not feasible for me to keep huge numbers in memory, so I want to store them in binary files.

I already have this working with text files, but for numbers of 100,000+ bits, binary files should (hopefully) save a lot of space.

I have written a short example to help you understand what I'm trying to do:

ofstream binFile;
binFile.open ("binary.bin", ios::out | ios::binary);

mpz_class test;
test.set_str("999999999999999",10);

binFile.write((char *)(&test), sizeof(test));

cout << "NUMBER: " << test << "\tSIZE: " << sizeof(test) << endl;
binFile.close();

I am trying to write the character-data representing the mpz_class instance. Then, to test it, I tried to read the file:

ifstream binFile2;
binFile2.open("binary.bin", ios::in | ios::binary);

mpz_class num1 = 0; 
binFile2.read ((char *)(&num1), sizeof(num1));

cout << "NUMBER: " << num1 << "\tSIZE: " << sizeof(num1) << endl;
binFile2.close();

Many examples I see online use this method for storing class data into binary files, but my output is this:

NUMBER: 999999999999999 SIZE: 12

NUMBER: 8589934595      SIZE: 12

Why can't I store class data directly, and then read it again? There is no way an instance of mpz_class can have size 12. Is this the size of a pointer?

I have also tried this, but I think it's basically the same thing:

char* membuffer = new char[12]; //sizeof(test) returned 12
binFile2.read (membuffer , sizeof(test));
memcpy(&test, membuffer, sizeof(test)); // copy from the buffer itself, not from the address of the pointer variable

Any advice on how to fix this would be appreciated. Thanks.


Solution

  • I think you need to spend more time with the GMP manual (section 12.1):

    Conversions back from the classes to standard C++ types aren’t done automatically, instead member functions like get_si are provided (see the following sections for details).

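    Writing the object itself cannot work: an mpz_class wraps an mpz_t, which holds only two ints and a pointer to the heap-allocated limbs. On a 32-bit build that is 12 bytes, so your file contains a pointer value rather than the digits. What you probably need to do instead is call mpz_class::get_str and mpz_class::set_str. The C++ interface is just a light wrapper around the C API, so you may be better off using the low-level functions, which are much better documented; for integers these are mpz_get_str and mpz_set_str.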

    If you want true binary output, GMP (and MPIR) do provide it: mpz_export and mpz_import convert between an integer and a raw array of words, and mpz_out_raw and mpz_inp_raw write and read a portable binary format through a FILE*. Otherwise, work with strings. I'm not sure whether there are limits on the size of these strings, so test your code thoroughly if you plan to use such large numbers. For text, the most compact choice is base 62 (the maximum allowed): a 100,000-bit number takes roughly 16,800 characters in base 62, versus about 30,100 in base 10 and 100,000 in base 2 (one byte for every bit), while the raw binary form is 12,500 bytes.
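
For what it's worth, here is a minimal sketch of an on-disk layout that stores the number's bytes rather than the object. It uses only the standard library. The helper names (write_bigint_bytes, read_bigint_bytes, round_trips) are made up for illustration and are not part of GMP or MPIR. The byte vector stands in for the magnitude of the number, however you obtain it (GMP's mpz_export is one way; that part is not shown). The 8-byte little-endian length prefix is my own assumption, not a format GMP defines.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <vector>

// Hypothetical helpers, not part of GMP or MPIR.
// `magnitude` stands in for the raw bytes of a big integer.

// Write an 8-byte little-endian length, then the bytes themselves.
void write_bigint_bytes(std::ostream& out,
                        const std::vector<unsigned char>& magnitude) {
    const std::uint64_t n = magnitude.size();
    unsigned char header[8];
    for (int i = 0; i < 8; ++i) {
        header[i] = static_cast<unsigned char>((n >> (8 * i)) & 0xFF);
    }
    out.write(reinterpret_cast<const char*>(header), 8);
    if (!magnitude.empty()) {
        out.write(reinterpret_cast<const char*>(magnitude.data()),
                  static_cast<std::streamsize>(magnitude.size()));
    }
    if (!out) throw std::runtime_error("write failed");
}

// Read the length prefix, then exactly that many bytes.
std::vector<unsigned char> read_bigint_bytes(std::istream& in) {
    unsigned char header[8];
    in.read(reinterpret_cast<char*>(header), 8);
    if (in.gcount() != 8) throw std::runtime_error("truncated length header");

    std::uint64_t n = 0;
    for (int i = 0; i < 8; ++i) {
        n |= static_cast<std::uint64_t>(header[i]) << (8 * i);
    }

    std::vector<unsigned char> magnitude(static_cast<std::size_t>(n));
    if (n > 0) {
        in.read(reinterpret_cast<char*>(magnitude.data()),
                static_cast<std::streamsize>(n));
        if (static_cast<std::uint64_t>(in.gcount()) != n) {
            throw std::runtime_error("truncated payload");
        }
    }
    return magnitude;
}

// Self-check: serialize to an in-memory binary stream and read it back.
bool round_trips(const std::vector<unsigned char>& magnitude) {
    std::stringstream buf(std::ios::in | std::ios::out | std::ios::binary);
    write_bigint_bytes(buf, magnitude);
    return read_bigint_bytes(buf) == magnitude;
}
```

The same two functions work with an ofstream or ifstream opened with ios::binary. Because the length is stored explicitly, several numbers can be written back to back in one file and read one at a time, so they never all have to sit in memory. A real implementation would also want to store the sign and to sanity-check the length before allocating.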