Search code examples
c++boost-random

Behavior of discard for boost random number generators


I am wrapping boost random number generators with an adapter class to implement a Monte Carlo routine. When writing unit tests on the member functions of the class, I assumed that the behavior of .discard(unsigned int N) is to draw N random numbers without storing them, thus advancing the state of the rng. The boost code is :

void discard(boost::uintmax_t z)
{
    if(z > BOOST_RANDOM_MERSENNE_TWISTER_DISCARD_THRESHOLD) {
        discard_many(z);
    } else {
        for(boost::uintmax_t j = 0; j < z; ++j) {
            (*this)();
        }
    }
}

which supports my assumption. However, I find that the sequence that results from .discard(1) is not one number different than the same sequence without discard. The code:

#include <iostream>
#include <iomanip>
#include <random>
#include <boost/random.hpp>

int main()
{
    boost::mt19937 uGenOne(1);
    boost::variate_generator<boost::mt19937&, boost::normal_distribution<> > distOne(uGenOne, boost::normal_distribution<>());

    boost::mt19937 uGenTwo(1);
    boost::variate_generator<boost::mt19937&, boost::normal_distribution<> > distTwo(uGenTwo, boost::normal_distribution<>());
    distTwo.engine().discard(1);

    unsigned int M = 10;
    std::vector<double> variatesOne(M);
    std::vector<double> variatesTwo(M);

    for (unsigned int m = 0; m < M; ++m) {
        variatesOne[m] = distOne();
        variatesTwo[m] = distTwo();
    }

    for (unsigned int m = 0; m < M; ++m)
        std::cout << std::left << std::setw(15) << variatesOne[m] << variatesTwo[m] << std::endl;

    return 0;
}

outputs

2.28493        0.538758  
-0.668627      -0.0017866
0.00680682     0.619191  
0.26211        0.26211   
-0.806832      -0.806832 
0.751338       0.751338  
1.50612        1.50612   
-0.0631903     -0.0631903
0.785654       0.785654  
-0.923125      -0.923125

Is my interpretation of how .discard operates incorrect? Why are the two sequences different in the first three outputs, and then identical?

(This code was compiled on msvc 19.00.23918 and g++ 4.9.2 on cygwin, with identical results).


Solution

  • The issue here seems to be that the engine is not being modified correctly or the distrobution is adding doing some extra work. If we use the engine directly like

    int main()
    {
        boost::mt19937 uGenOne(1);
    
        boost::mt19937 uGenTwo(1);
        uGenTwo.discard(1);
    
        unsigned int M = 10;
        std::vector<double> variatesOne(M);
        std::vector<double> variatesTwo(M);
    
        for (unsigned int m = 0; m < M; ++m) {
            variatesOne[m] = uGenOne();
            variatesTwo[m] = uGenTwo();
        }
    
        for (unsigned int m = 0; m < M; ++m)
            std::cout << std::left << std::setw(15) << variatesOne[m] << variatesTwo[m] << std::endl;
    
        return 0;
    }
    

    It produces

    1.7911e+09     4.28288e+09
    4.28288e+09    3.09377e+09
    3.09377e+09    4.0053e+09
    4.0053e+09     491263
    491263         5.5029e+08
    5.5029e+08     1.29851e+09
    1.29851e+09    4.29085e+09
    4.29085e+09    6.30312e+08
    6.30312e+08    1.01399e+09
    1.01399e+09    3.96591e+08
    

    Which is a 1 shifted sequence as we expect since we discarded the first output.

    So you are correct in how discard works. I am not exactly sure why there is a discrepancy when you do it through the boost::variate_generator. I cannot see why the first three numbers differ but all of the rest of the output matches.