Search code examples
pythonmcmc

Markov Chain Monte Carlo integration and infinite while loop


I'm implementing a Markov Chain Monte Carlo with both Metropolis and Barker's α's for numerical integration. I've created a class called MCMCIntegrator(). Below the __init__() method, are the g(x) the PDF of the function I'm integrating and alpha method, implementing the Metropolis and Barker α's.

import numpy as np
import scipy.stats as st

class MCMCIntegrator:

    def __init__(self):
        self.size = 1000
        self.std = 0.6
        self.real_int = 0.06496359
        self.sample = None

    @staticmethod
    def g(x):
        return st.gamma.pdf(x, 1, scale=1.378008857)*np.abs(np.cos(1.10257704))

    def alpha(self, a, b, method):

        if method:
            return min(1, self.g(b) / self.g(a))

        else:
            return self.g(b) / (self.g(a) + self.g(b))

The size is the size of the sample that the class must generate, std is the standard deviation of the normal kernel, which you will see in a few seconds. The real_int is the value of the integral from 1 to 2 of the function we're integrating. I've generated it with a R script. Now, to the problem.

     def _chain(self, method):

        """
            Markov chain heat-up with burn-in

        :param method: Metropolis or Barker alpha
        :return: np.array containing the sample
        """
        old = 0
        sample = np.zeros(int(self.size * 1.3))
        i = 0

        while i != len(sample):
            new = np.random.normal(loc=old, scale=self.std)
            new = abs(new)
            al = self.alpha(old, new, method=method)
            u = np.random.uniform()

            if al > u:
                sample[i] = new
                i += 1
                old = new

        return np.array(sample)

Below this method is an integrate() method that calculates the proportion of numbers in the [1, 2] interval:

    def integrate(self, method=None):
        """
            Integration step

        """

        sample = self._chain(method=method)

        # discarding 30% of the sample for the burn-in
        ind = int(len(sample)*0.3)
        sample = sample[ind:]
        setattr(self, "sample", sample)

        sample = [1 if 1 < v < 2 else 0 for v in sample]
        return np.mean(sample)

this is the main function:

def main():

    print("-- RESULTS --".center(20), end='\n')
    mcmc = MCMCIntegrator()
    print(f"\t{mcmc.integrate()}", end='\n')
    print(f"\t{np.abs(mcmc.integrate() - mcmc.real_int) / mcmc.real_int}")


if __name__ == "__main__":
    main()

I'm stuck in an infinite while loop, with no idea why this is happening.


Solution

  • While I have no prior exposure to Python and no direct explanation for the infinite loop, here are some problematic issues in the code:

    1. The

       while i != len(sample):
      

      loop increments the value i only when the Uniform variate is below the acceptance probability if al > u: This is not how Metropolis-Hastings operates. If the Uniform variate is above the acceptance probability, the current value of the chain is duplicated. but this does not explain for the infinite loop since a proposed value should eventually be accepted.

    2. If the target density is

      st.gamma.pdf(x, 1, scale=1.378008857)*np.abs(np.cos(1.10257704))  
      

      then (i) what is the point of the second and constant term np.abs(np.cos(1.10257704)) and (ii) where is the need for so many digits?

    3. The proposal distribution

         new = np.random.normal(loc=old, scale=self.std)
         new = abs(new)
      

      is a folded normal, which density is not symmetric. Hence it should appear in the Metropolis-Hastings probability but may have little impact since the scale is small.

    Here is my R rendering of the Python code (edited and corrected)

    self.size = 1e5
    self.std = 0.6
    self.real_int = 0.06496359
    
    g <- function(x){dgamma(x, shape=1, scale=1.378)}
    
    alpha <- function(a, b, method=1)ifelse(method,
                min(1, r <- g(b) / g(a)), 1 / (1 + 1 / r))
    
    old = 0 
    smple = rep(0,self.size * 1.3)
    for (i in 1:length(smple)){
                new = abs(old+self.std*rnorm(1))
                al = alpha(old, new, 0)
                old=smple[i]=ifelse(al > runif(1), new, old)
                 }
    
    ind = trunc(length(smple)*0.3)
    smple = sample[ind:length(smple)]
    
    hist(smple,pro=TRUE,nclass=10*log2(self.size),col="wheat")
    curve(g(x),add=TRUE,lwd=2,col="sienna")
    

    clearly reproducing the Gamma target:

    enter image description here

    without the correction for the non-symmetric proposal. The correction would be

    q <- function(a, b)
            dnorm(b-a,sd=self.std)+dnorm(-b-a,sd=self.std)
    
    alpha <- function(a, b, method=1){
            return(ifelse(method,
                min(1, r <- g(b) * q(b,a) / g(a) / q(a,b)),
                1 / (1 + 1/r)))}
    
    old = 0 
    smple = rep(0,self.size * 1.3)
    for (i in 1:length(smple)){
                new = abs(old+self.std*rnorm(1))
                al = alpha(old, new, 3)
                old=smple[i]=ifelse(al > runif(1), new, old)
                 }
    

    and makes no difference in the above picture. (The acceptance rate for the Metropolis ratio is 85%, while for Baker's, it is 48%.)