GMP for C++: more troubles with `auto`


Here is more strange behaviour from the GMP C++ class interface related to the C++ auto keyword. My previous question was also about this keyword, and removing it solved my problem. That was a compilation problem, though; this one happens at runtime.

The simple test below compiles without errors but dumps core when run.

#include <iostream>
#include <gmpxx.h>

auto test()
{
  return mpz_class(2) * 2;
}

int main()
{
  std::cout << test() << std::endl;
}

If I replace the keyword auto with mpz_class, the test runs correctly. Also, if I omit the compilation flag -O3, the test runs but prints a wrong number that varies with each execution (which usually happens when the execution depends on uninitialised data).

It looks like the gmpxx.h header contains something that makes the C++ compiler generate wrong code in this case. How can I neutralize it?

A possible answer would be "Don't use the auto return type", but I prefer to use it because it simplifies modifications.

The system environment:

  • OS: Ubuntu 22.04.4 LTS
  • Compiler: g++ (Ubuntu 12.3.0-1ubuntu1~22.04) 12.3.0
  • Compiler/linker flags: -O3 -Wall -std=c++20 -lgmpxx -lgmp
  • GMP version: 6.2.1

Solution

  • The documentation for the GMP C++ bindings is quite clear: avoid using auto for GMP expressions. (In fact, even ordinary templates can cause problems).

    The issue is that GMP relies on the "expression template" technique. For mpz_class a, b, this means that a + b is not an mpz_class itself, but rather some other type of object that represents the sum of a and b without actually computing it. Such an object will typically refer to a and b by reference. The object can be implicitly converted to mpz_class. The idea is that this technique allows functions to be specialized depending on the functions that produced their operands. E.g. here's a toy example:

    #include <tuple>

    struct my_num { double num; };
    template<typename... Ts>
    struct my_num_sum { // expression template representing sums
        std::tuple<Ts&...> summands;
        operator my_num() {
            // pretend that there is a more efficient method to add together
            // multiple values of my_num at once than just repeatedly using +
            return std::apply([](Ts&... summands) { return my_num((0 + ... + my_num(summands).num)); }, summands);
        }
    };
    auto operator+(my_num &a, my_num &b) {
        return my_num_sum<my_num, my_num>(std::tie(a, b));
    }
    // this specializes the addition operator to do something different (i.e. more efficient) when one of the operands is itself a sum
    template<typename... Ts, typename U>
    auto operator+(my_num_sum<Ts...> as, U &b) {
        return my_num_sum<Ts..., U>(std::tuple_cat(as.summands, std::tie(b)));
    }
    

    In this example, for my_num a, b, c, the expression a + b + c is not a my_num but rather a my_num_sum<my_num, my_num, my_num>. The line my_num d = a + b + c adds up a, b and c all at once (in one call to my_num_sum::operator my_num()), which can be more efficient than adding up a and b into an intermediate object and then adding c to that. (This isn't true for doubles, but it does make sense for larger types like vectors).
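
    The type claim above can be checked by the compiler itself. The following is a minimal sketch, not part of the original toy code: it repeats the toy types (using brace initialization so it also builds as C++17, which is an assumption on my part) and adds a hypothetical helper, sum_three, whose static_assert lines confirm that a + b + c is a my_num_sum<my_num, my_num, my_num> and not a my_num.

```cpp
#include <tuple>
#include <type_traits>

struct my_num { double num; };

template<typename... Ts>
struct my_num_sum { // expression template representing sums
    std::tuple<Ts&...> summands;
    operator my_num() {
        // add up all referenced summands in a single pass
        return std::apply([](Ts&... xs) { return my_num{(0 + ... + my_num(xs).num)}; }, summands);
    }
};

auto operator+(my_num &a, my_num &b) {
    return my_num_sum<my_num, my_num>{std::tie(a, b)};
}

template<typename... Ts, typename U>
auto operator+(my_num_sum<Ts...> as, U &b) {
    return my_num_sum<Ts..., U>{std::tuple_cat(as.summands, std::tie(b))};
}

// sum_three is a hypothetical helper, introduced only for this illustration.
double sum_three(my_num a, my_num b, my_num c) {
    // decltype does not evaluate the expression; it only inspects its type.
    static_assert(std::is_same_v<decltype(a + b + c),
                                 my_num_sum<my_num, my_num, my_num>>);
    static_assert(!std::is_same_v<decltype(a + b + c), my_num>);

    // The implicit conversion runs here, while a, b and c are still alive.
    my_num d = a + b + c;
    return d.num;
}
```

    Because a, b and c outlive the conversion on the line that defines d, no reference dangles in this sketch.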

    You see there's a big danger now: a user function can easily produce dangling references:

    auto oops(my_num a, my_num b) { return a + b; }
    // returns a my_num_sum containing dangling references to parameters!
    

    But the problem goes away if you ask for an actual number, since now the implicit conversion from the expression template to the actual data type is used.

    my_num fine(my_num a, my_num b) { return a + b; }
    

    If you really insist on using auto return types, fine, but now you still need to add casts to my_num in places where you don't want expression templates to leak.

    auto okay(my_num a, my_num b) { return my_num(a + b); }
    

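    The same thing happens in your test: with an auto return type, test() returns the expression object for mpz_class(2) * 2, which refers to the temporary mpz_class(2), and that temporary is destroyed before the caller can use the result. The two fixes above are exactly the ones suggested by the GMP documentation. Write either one of the following to define test properly: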

    mpz_class test() { return mpz_class(2) * 2; }
    auto test() { return mpz_class(mpz_class(2) * 2); }
    

    You have to say mpz_class somewhere because otherwise there will be nothing to trigger the implicit conversion from expression template to number.