Search code examples
rstanrstan

Errors while trying to write joint likelihood of a normal_lpdf and a bernoulli_lpmf in Stan


I am trying to calculate Bayes Factor using the package bridgesampling in R. For this I am trying to fit the model in Stan. It is a hierarchical model with the Alternative (Hypothesis) model having parameters beta, gamma1, gamma3 & sigma.

Please find the model below:

stanmodelH1 = 'data {
	int<lower=0> n;
	int<lower=0> k;
	vector[n] y;
	int x1[n];
	real<lower=0> a;
	real<lower=0> b;
	matrix[n, k] G;
	matrix[n, k+1] X1;
	matrix[n, 2] x;
}

parameters {
	vector[2] beta;
	vector[k+1] gamma3;
	vector[k] gamma1;
	real<lower=0> sigma;
}

model {
	target += inv_gamma_lpdf(sigma | a, b);
	target += normal_lpdf(gamma1 | 0, sqrt(1.1));
	target += normal_lpdf(gamma3 | 0, 1);
	target += normal_lpdf(beta | 0, sqrt(1.2));
  target += normal_lpdf(y | x*beta + G*gamma1, sqrt(sigma)) + bernoulli_lpmf(x1 | Phi(X1*gamma3));
  
  //these two _lpdf and _lpmf should be added and not multiplied. This is the answer.
}
'

and the corresponding rstan code:

stanfitmodelH1 = sampling(stanmodelH1, data = list(n = n, k = k, y = y, x1 = x1, 
                                                   a = 4, b = 3, G = G, X1 = X1, x = x),
                          iter = 50000, warmup = 20000, chains = 3, cores = 3,
                          control = list(adapt_delta = 0.99, max_treedepth = 15))

Now, when I sample from this model in Stan; it throws the following errors:

Warning messages:
1: There were 205 divergent transitions after warmup. Increasing adapt_delta above 0.99 may help. See
http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup 
2: There were 3 chains where the estimated Bayesian Fraction of Missing Information was low. See
http://mc-stan.org/misc/warnings.html#bfmi-low 
3: Examine the pairs() plot to diagnose sampling problems

Please note that, similar errors are thrown when sampling from the null model. Where basically beta = beta0 = c(rnorm(1), 0) and gamma3 = rep(0, k+1) in contrast to the no-restriction in the alternate model.

These are clearly some random errors occurring due to some (I do not know what and I really want to find out) sampling problems in Stan. I know this because I ran the model (and the null model as well) several times without the following part: bernoulli_lpmf(x1 | Phi(X1*gamma3)) in the last line of the Stan model and modifying the data block of the code accordingly and then it returned no warnings or errors. But, this will not solve my purpose. Because, my hypothesis involves both beta & gamma3 and the above portion clearly adds the gamma3.

As a result of the errors in Stan, the function bridge_sampler from the R package bridgesampling is returning the following errors:

> H1.bridge = bridge_sampler(stanfitmodelH1, silent = TRUE)
Error in while (i <= maxiter && criterion_val > tol) { : 
  missing value where TRUE/FALSE needed
In addition: Warning message:
2744 of the 45000 log_prob() evaluations on the proposal draws produced -Inf/Inf. 

Also note that, these errors (from bridge_sampler) are not thrown when there are no errors from Stan. e.g., the case where the model is run without the bernoulli_lpmf(x1 | Phi(X1*gamma3)) part.

I tried to best explain my problem. Please ask me should any more clarification is needed.

Could someone please identify where exactly the error(s) is/are occurring?

Thank you very much in advance!


Solution

  • I was making a silly mistake all along! As pointed out by Ben Goodrich Sir the logarithms should be added instead of being multiplied as they are in the logarithmic scale. A very silly mistake from my side! I will edit the question with the correct code! Thanks to you all!