I understand I can calculate the log likelihood of each sample during sampling, e.g.
...
model {
for (i in 1:N) {
(y[i] - 1) ~ bernoulli(p[i, 2]);
}
}
generated quantities {
vector[N] log_lik;
for (i in 1:N){
log_lik[i] = bernoulli_lpmf((y[i] - 1) | p[i, 2]);
}
}
After fitting, I can then extract log likelihood using the loo
package:
log_lik_m <- extract_log_lik(stan_fit)
But I want to evaluate log likelihood of unseen data. This is possible in brms:
ll <- log_lik(fit_star, newdata = new_df)
But I would like to do this with rstan
, since I can't easily define my model in brms (I am assuming).
For reference, I am trying to use Estimated LFO-CV to evaluate and compare my time-series model. (e.g. https://github.com/paul-buerkner/LFO-CV-paper/blob/master/sim_functions.R#L186)
Thanks to the link from @dipetkov, I solved this myself. I didn't use the exact methods in the link, but came up with an alternative. You can call stan functions from R to get it to compute log likelihood for your model, even with unseen data (and its very fast!).
First, I put everything in my transformed parameters
block into a function in stan's functions
block. Then, I created a second function that wraps the first function, and evaluates the log likelihood for given observations and provided parameter estimates (I then removed my generated_quantities
block). rstan
has a function expose_stan_functions
which adds all functions in the stan functions
block to the R environment.
You can then call the log likelihood function you made to evaluate your model with any observations (previously seen or unseen), along with a set of parameter estimates.