Search code examples
c++boostprobabilityboost-randomdynamic-resizing

Is boost::random::discrete_distribution dynamically resizable?


I cannot find much documentation of the Boost version of discrete_distribution. After much Google searching, I still can't even find a list of methods that this class has, and whether any of them function to re-assign the probabilities.

In my case, I am writing an evolutionary dynamics algorithm. At each time step, members of the population can be randomly selected to either die or reproduce. Because of this, the total number of entries within my discrete distribution will change almost every iteration.

I want to have a single object that I define before the simulation starts, called gillespie_dist (the discrete distribution governing this Gillespie algorithm). But I want, at the end of each iteration, to potentially change specific values and/or add new values to gillespie_dist and specifically do not want to create new instances of the discrete_distribution every iteration.

What is a good approach for this. Are there methods for pushing a new value onto the discrete_distribution object, methods for changing a distribution's value at a particular index, or better yet, somehow "re-initializing" the entire distribution using the vector iterator idea mentioned here?


Solution

  • I looked into the code of the gcc libstdc++ 4.7 implementation of std::discrete_distribution.

    The weights are stored as a vector<double> in a private member. There is no access to its resize method in the public interface.

    I'll try and dig out the implementation of its operator() (which is in the cpp it looks like), it should be no trouble to roll your own.

    Here is the main action, and my explanation following:

      template<typename _IntType>
        void
        discrete_distribution<_IntType>::param_type::
        _M_initialize()
        {
          if (_M_prob.size() < 2)
            {
              _M_prob.clear();
              return;
            }
    
          const double __sum = std::accumulate(_M_prob.begin(),
                                               _M_prob.end(), 0.0);
          // Now normalize the probabilites.
          __detail::__transform(_M_prob.begin(), _M_prob.end(), _M_prob.begin(),
                              std::bind2nd(std::divides<double>(), __sum));
          // Accumulate partial sums.
          _M_cp.reserve(_M_prob.size());
          std::partial_sum(_M_prob.begin(), _M_prob.end(),
                           std::back_inserter(_M_cp));
          // Make sure the last cumulative probability is one.
          _M_cp[_M_cp.size() - 1] = 1.0;
        }
    
      template<typename _IntType>
        template<typename _UniformRandomNumberGenerator>
          typename discrete_distribution<_IntType>::result_type
          discrete_distribution<_IntType>::
          operator()(_UniformRandomNumberGenerator& __urng,
                     const param_type& __param)
          {
            if (__param._M_cp.empty())
              return result_type(0);
    
            __detail::_Adaptor<_UniformRandomNumberGenerator, double>
              __aurng(__urng);
    
            const double __p = __aurng();
            auto __pos = std::lower_bound(__param._M_cp.begin(),
                                          __param._M_cp.end(), __p);
    
            return __pos - __param._M_cp.begin();
          }
    

    So basically it calculates an auxilary vector _M_cp at initialization time which is essentially a discete cummulative density function of the weights. So producing a sample just means generating a uniform random variable and searching for the first occurence of that in the cummulative distribution (this is the lower_bound call above), returning its index.

    eg, if the weights vector is:

    { 1, 2, 1, 3 }
    

    then the cp is calculated as:

    { 1, 1+2, 1+2+1, 1+2+1+3 }
    =
    { 1, 3, 4, 7 }
    

    so I uniformly select from 0..6 and get 4, so I pick the third one.