c++ c++11 c++17 std-function stdbind

std::function works beautifully with std::bind - but why?


I was using a std::uniform_int_distribution to generate primes (p). I put the distribution object in an anonymous namespace - which seems like C++ 'static linkage' for grown-ups...

namespace
{
    // a more pedantic range: [2, 18446744073709551557]
    std::uniform_int_distribution<uint64_t> p_dist {2};

    std::mt19937 rng; // (sufficient IV states for uniqueness)
}

Note that I seed the Mersenne Twister as thoroughly as portable code will allow. This isn't really important to the question, though. It's just to assure the reader I'm using the random facilities properly:

std::seed_seq::result_type data[rng.state_size];
std::random_device rdev;

std::generate_n(data, rng.state_size, std::ref(rdev));
std::seed_seq rng_seed (data, data + rng.state_size);
rng.seed(rng_seed);

This is very convenient, as I have a deterministic u64_prime(p) function, using (7) bases, that can determine if (p) is prime:

uint64_t p;
while (!u64_prime(p = p_dist(rng)))
    ;

Now I create a std::function object:

std::function<uint64_t()> zp_rng = std::bind(
    decltype(p_dist){0, p - 1}, std::ref(rng));

This function, zp_rng(), can be invoked to return a random number in Z(p). That is, it draws from the distribution object for [0, p - 1], using the output of the referenced rng.


Now this is very impressive - but I've effectively adopted it by cut-and-paste, with little understanding of how std::function interacts with the arguments given to std::bind.

I'm not confused by decltype(p_dist){0, p - 1} - that's just a way to specify we still want to use a std::uniform_int_distribution. My understanding of std::ref(rng) is that it prevents a local copy of the rng being instantiated, and forces the use of a reference instead... so:


Q: What are the basic rules that effectively determine: dist(rng) being used - I don't see why std::bind would enforce this interaction. A lot of interactions seem based around operator () methods.

Q: std::function is helpfully referred to as 'a general-purpose polymorphic function wrapper' on cppreference.com. So is it a function that encapsulates a uint64_t return type? Or again, making use of operator () syntax to drive the notion of a function?

As incredibly useful as these constructs are, I feel like I'm cargo-cult programming to a degree here. I'm looking for an answer which resolves any ambiguities in a concrete way, and adds insight to similar questions - how are bind arguments expected to interact, and how does the function signature reflect that?


I'm not getting any positive feedback about the use of std::bind, but plenty about the superior results (and code generation) of simply using lambda functions, even in such a simple case. My own tests validate this.
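
For comparison, here is a minimal sketch of the two forms side by side. The names zp_dist, make_zp_rng_bind and make_zp_rng_lambda are hypothetical helpers introduced only for this illustration, and the lambda's init-capture assumes C++14 or later.

```cpp
#include <cstdint>
#include <functional>
#include <random>

// Hypothetical alias, used only to keep the two factories short.
using zp_dist = std::uniform_int_distribution<std::uint64_t>;

// std::bind form, as in the question: the bind result stores a copy of the
// distribution and a reference_wrapper around the engine.
std::function<std::uint64_t()> make_zp_rng_bind(std::mt19937& rng, std::uint64_t p)
{
    return std::bind(zp_dist{0, p - 1}, std::ref(rng));
}

// Lambda form: the distribution is captured by value, the engine by reference.
// 'mutable' is needed because the distribution's operator() is non-const.
std::function<std::uint64_t()> make_zp_rng_lambda(std::mt19937& rng, std::uint64_t p)
{
    return [dist = zp_dist{0, p - 1}, &rng]() mutable { return dist(rng); };
}
```

Both factories make the same underlying call, dist(rng); the lambda simply spells that call out in its body instead of leaving it to std::bind.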


Solution

  • Q: What are the basic rules that effectively determine: dist(rng) being used - I don't see why std::bind would enforce this interaction. A lot of interactions seem based around operator () methods.

    std::bind performs partial function application (and, with nested bind expressions, function composition). The first argument must be a function object, i.e. something callable like a function (e.g. a normal function, or a class with an overloaded operator()).

    A call to std::bind makes copies of its arguments, "binds" the copies of the remaining arguments to the copy of the first argument (the function object), and returns a new function object that, when called, invokes the copied function object with the bound arguments.

    So in a simple case:

    int f(int i) { return i; }
    auto f1 = std::bind(f, 1);
    

    this binds the value 1 to the function f, creating a new function object that can be called with no arguments. When you invoke f1() it will invoke f with the argument 1, i.e. it will call f(1), and return whatever that returns (which in this case is just 1).
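
A small sketch of the simple case, extended with a placeholder to show that not every argument has to be bound up front. The subtract function and the two wrapper functions are hypothetical names added for illustration.

```cpp
#include <functional>

int f(int i) { return i; }

// Hypothetical two-argument function, used only to illustrate placeholders.
int subtract(int a, int b) { return a - b; }

// Every argument is bound, so the result is callable with no arguments.
int call_fully_bound()
{
    auto f1 = std::bind(f, 1);
    return f1();                 // calls f(1)
}

// Only the first argument is bound; _1 stands for the caller's first argument.
int call_partially_bound(int b)
{
    using namespace std::placeholders;
    auto ten_minus = std::bind(subtract, 10, _1);
    return ten_minus(b);         // calls subtract(10, b)
}
```

Here call_fully_bound() yields 1 and call_partially_bound(3) yields 7.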

    The actual type of the thing returned by std::bind(f, 1) is some implementation-specific class type, maybe called something like std::__detail::__bind_type<int(*)(int), int>. You're not meant to refer to that type directly; you would either capture the object using auto or store it in something else that doesn't care about the precise type, so either:

    auto f1 = std::bind(f, 1);
    

    or:

    std::function<int()> f1 = std::bind(f, 1);
    

    In your more complex case, when you call std::bind(decltype(p_dist){0, p - 1}, std::ref(rng)) you get a new function object that contains a copy of the temporary decltype(p_dist){0, p - 1} and a copy of the reference_wrapper<std::mt19937> created by std::ref(rng). When you invoke that new function object it will call the contained distribution, passing it a reference to rng.

    That means it is callable with no arguments, and it will call the contained random number distribution with the contained random engine, and return the result.
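
The effect of std::ref can be checked directly. The sketch below compares the engine's state before and after one call, with the engine bound once by reference and once by value. The fixed seed 42, the range [0, 96] and the function names are assumptions chosen for illustration.

```cpp
#include <cstdint>
#include <functional>
#include <random>

// Bound via std::ref: the call draws from the caller's engine.
bool bound_by_reference_advances_engine()
{
    std::mt19937 rng(42u);            // fixed seed, for illustration only
    std::mt19937 snapshot = rng;      // copy of the initial state

    std::uniform_int_distribution<std::uint64_t> dist{0, 96};
    auto draw = std::bind(dist, std::ref(rng));
    draw();                           // invokes the stored copy of dist on rng

    return rng != snapshot;           // the caller's engine has moved on
}

// Bound by value: the bind result owns its own copy of the engine.
bool bound_by_value_advances_engine()
{
    std::mt19937 rng(42u);
    std::mt19937 snapshot = rng;

    std::uniform_int_distribution<std::uint64_t> dist{0, 96};
    auto draw = std::bind(dist, rng); // copies rng into the bind result
    draw();                           // advances only the internal copy

    return rng != snapshot;           // the caller's engine is unchanged
}
```

The first function returns true and the second returns false. Without std::ref, every copy of the bound object would carry its own engine state, and the original rng would never advance.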

    Q: std::function is helpfully referred to as 'a general-purpose polymorphic function wrapper' on cppreference.com. So is it a function that encapsulates a uint64_t return type?

    A std::function<uint64_t()> is a wrapper for a function object that is callable with no arguments and that returns a uint64_t (or something implicitly convertible to uint64_t). It can be used to store a copy of an arbitrary function object that is copy constructible, and callable with no arguments, and returns something convertible to uint64_t.

    (And more generally, a std::function<R(A1, A2 ... AN)> is a wrapper for a function object that returns R when called with N arguments, of types A1, A2 ... AN.)

    Since the result of your std::bind call is a copy constructible function object that is callable with no arguments and returns a uint64_t, you can store the result of that std::bind call in a std::function<uint64_t()>, and when you invoke the function it will invoke the result of the bind call, which will invoke the contained distribution, and return the result.
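
To illustrate the "wrapper" part, here is a sketch that stores four unrelated kinds of callable in the same std::function<uint64_t()> type. The names seven, Counter, add and call_all are hypothetical and exist only for this example.

```cpp
#include <cstdint>
#include <functional>
#include <vector>

std::uint64_t seven() { return 7; }

// Hypothetical function object that carries state.
struct Counter
{
    std::uint64_t next;
    std::uint64_t operator()() { return next++; }
};

int add(int a, int b) { return a + b; }

std::vector<std::uint64_t> call_all()
{
    std::vector<std::function<std::uint64_t()>> callables;
    callables.push_back(seven);                             // plain function
    callables.push_back(Counter{5});                        // class with operator()
    callables.push_back([] { return std::uint64_t{11}; });  // lambda
    callables.push_back(std::bind(add, 20, 22));            // bind result; int converts to uint64_t

    std::vector<std::uint64_t> results;
    for (auto& fn : callables)
        results.push_back(fn());   // same call syntax, whatever is stored
    return results;
}
```

call_all() returns {7, 5, 11, 42}. The four stored objects have different concrete types, but each is copy constructible, callable with no arguments, and returns something convertible to uint64_t, which is all std::function<uint64_t()> requires.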

    Or again, making use of operator () syntax to drive the notion of a function?

    I'm not sure what this means, but in general it's true that in C++ we often talk about "function objects" or "callables", which are generalisations of functions, i.e. things that can be invoked using function call syntax, e.g. a(b, c).
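
As a rough sketch of why dist(rng) is just function call syntax, the example below uses two hypothetical stand-in types (CountingEngine and ModuloDistribution, which are not part of the standard library and are not statistically sound) that mimic the shape of an engine and a distribution through operator().

```cpp
#include <functional>

// Hypothetical stand-in for a random engine: each call returns the next integer.
struct CountingEngine
{
    unsigned value;
    unsigned operator()() { return value++; }
};

// Hypothetical stand-in for a distribution: maps engine output into [0, bound).
struct ModuloDistribution
{
    unsigned bound;

    template <typename Engine>
    unsigned operator()(Engine& engine) const { return engine() % bound; }
};

unsigned direct_call()
{
    CountingEngine engine{12};
    ModuloDistribution dist{10};
    return dist(engine);              // shorthand for dist.operator()(engine)
}

unsigned call_through_bind()
{
    CountingEngine engine{12};
    auto bound = std::bind(ModuloDistribution{10}, std::ref(engine));
    return bound();                   // the same call, made on the stored copies
}
```

Both functions return 2. std::bind knows nothing about distributions or engines; it only requires that its first argument can be called with the bound arguments.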