Search code examples
rstanrstan

How can I use RStan without for loops?


Is there a way to more efficiently perform the following calculations in RStan?

I have only provided the minimal amount of coded that is needed:

parameters {
  real beta_0;
  real beta_1;
}     
model {
  vector [n] p_i = exp(beta_0 + beta_1*x)/[1 + exp(beta_0 + beta_1*x)];
  y ~ bernoulli(p_i);
  /* Likelihood:
  for(i in 1:n){
    p_i[i] = exp(beta_0 + beta_1*x[i])/(1 + exp(beta_0 + beta_1*x[i]));
    y[i] ~ bernoulli(p_i[i]);
  */}
// Prior:
  beta_0 ~ normal(m_beta_0, s_beta_0);
  beta_1 ~ normal(m_beta_1, s_beta_1);
}

I obtain the following error message: "Matrix expression elements must be type row_vector and row vector expression elements must be int or real, but found element of type vector". If I use the for loop (which is commented out), the code works fine, but I would like to limit the use of for loops in my code. In the above code, x, is a vector of length n.

Another example:

parameters {
  real gamma1;
  real gamma2;
  real gamma3;
  real gamma4;
}
model {
// Likelihood:
  real lambda;
  real beta;
  real phi;
  for(i in 1:n){
    lambda = exp(gamma1)*x[n_length[i]]^gamma2;
    beta = exp(gamma3)*x[n_length[i]]^gamma4;
    phi = lambda^(-1/beta);
    y[i] ~ weibull(beta, phi);
  }
  //y ~ weibull(exp(gamma1)*x^gamma2, exp(gamma3)*x^gamma4); //cannot raise a vector to a power
// Prior:
  gamma1 ~ normal(m_gamma1, s_gamma1);
  gamma2 ~ normal(m_gamma2, s_gamma2);
  gamma3 ~ normal(m_gamma3, s_gamma3);
  gamma4 ~ normal(m_gamma4, s_gamma4);
}

The above code works, but the commented out likelihood calculation does not work since I "cannot raise a vector to a power" (but you can in R). I would, once again, like to not be forced to use for loops. In the above code, n_length, is a vector of length n.

A final example. If I want to draw 10000 samples from a normal distribution in R, I can simply specify

rnorm(10000, mu, sigma)

But in RStan, I would have to use a for loop, for example

parameters {
      real mu;
      real sigma;
    }
generated quantities {
  vector[n] x;
  for(i in 1:n) {
    x[i] = normal_rng(mu, sigma);
  }
}

Is there anything that I can do to speed up my RStan examples?


Solution

  • This line of code:

    vector [n] p_i = exp(beta_0 + beta_1*x)/[1 + exp(beta_0 + beta_1*x)];
    

    is not valid syntax in the Stan language because square brackets are only used for indexing. It could instead be

    vector [n] p_i = exp(beta_0 + beta_1*x) ./ (1 + exp(beta_0 + beta_1*x));
    

    which utilizes the elementwise division operator, or better yet

    vector [n] p_i = inv_logit(beta_0 + beta_1*x);
    

    in which case y ~ bernoulli(p_i); would work as a likelihood. Better still, just do

    y ~ bernoulli_logit(beta_0 + beta_1 * x);
    

    and it will do the transformation for you in a numerically stable fashion. You could also use bernoulli_logit_glm, which is slightly faster particularly with large datasets.

    In Stan 2.19.x, I think you can draw N values from a probability distribution in the generated quantities block. But you are too worried about for loops. The Stan program is transpiled to C++ where loops are fast and almost all of the functions in the Stan language that accept vector inputs and produce vector outputs actually involve the same loop in C++ as if you had done the loop yourself.