Assume I have a multilevel structure of data. With a global distribution, from which I draw a highlevel distribution from which i draw a lowlevel distribution from which I draw my response variable. How would I implement such a thing in a stan model.
Below is a minimal example which I hope illustrates the problem. In the stan code there is
I am quite inexperienced with modeling and all tutorials on ML-Modeling do not have the Loop-Approach I though would make sense here, so I suspect I am completely heading in the wrong direction with that. So any help will be highly appreciated.
R-Code to generate and run the model
```{r}
library(ggplot2)
library(rstan)
library(bayesplot)
library(loo)
set.seed(0)
numberOFTest <- 50
resposeList <- c()
highLevelIndexList <- c()
lowLevelIndexList <- c()
for (highLevelIndex in seq(3)) {
highLevelMu <- rnorm(1,0,5) #High Level avarage drawn drom normal dist
for (lowLevelIndex in seq(6)) {
lowLevelMu <- rnorm(1,highLevelMu,3) #low Level avarage drawn drom normal dist
resposeList <- c(resposeList, rnorm(numberOFTest,lowLevelMu,1))
highLevelIndexList <- c(highLevelIndexList, rep(highLevelIndex,numberOFTest) )
lowLevelIndexList <- c(lowLevelIndexList, rep(paste0(highLevelIndex,lowLevelIndex),numberOFTest) )
}
}
lowLevelIndexList <- as.integer( as.factor(lowLevelIndexList))
dat <- list(Nr=length(resposeList),
Nh=length(unique(highLevelIndexList)),
Nl=length(unique(lowLevelIndexList)),
#############
r=seq(length(resposeList)),
response=resposeList,
highLevelIndex=(highLevelIndexList),
lowLevelIndex=(lowLevelIndexList)
)
model <- stan(data=dat, file="modle.stan",chains = 4, cores = 8, control=list(adapt_delta=0.95),iter = 2000)
The stan code
data {
int<lower=1> Nr; // number of rows (entries)
int<lower=1> Nh; // number of highlevels
int<lower=1> Nl; // number of low levels
int<lower=1> r[Nr]; // rownumber
real response[Nr]; //
int<lower=1> highLevelIndex[Nr]; //
int<lower=1> lowLevelIndex[Nr]; //
}
parameters {
real<lower=0> sigma; //
real<lower=0> sigma_global; //
real<lower=0> sigma_hL; //
real<lower=0> sigma_lL; //
real a_global; //
real a_hL[Nh]; //
real a_lL[Nl]; //
}
transformed parameters {
vector[Nr] mu;
for (rowIndex in 1:Nr) { //
mu[rowIndex] = a_lL[lowLevelIndex[rowIndex]];
}
}
//Idea of what I want
model {
sigma_global ~ exponential(0.1); //hyper
sigma_hL ~ exponential(0.1); //hyper
sigma_lL ~ exponential(0.1); //hyper
a_global ~ normal(0,sigma_global);
//-----------
for (rowIndex in 1:Nr) { //
a_hL[highLevelIndex[rowIndex]] ~ normal(a_global, sigma_hL); //hyper
a_lL[lowLevelIndex[rowIndex]] ~ normal(a_hL[highLevelIndex[rowIndex]], sigma_lL);
}
////
sigma ~ exponential(0.1);
response ~ normal(mu, sigma);
}
/*
model {///// Working but not what I want
a_lL ~ normal(0, 10);
////
sigma ~ exponential(0.1);
response ~ normal(mu, sigma);
}
*/
generated quantities { // for the PPC and loo
real n_rep[Nr];
vector[Nr] log_lik;
for (i in 1:Nr){
n_rep[i] = normal_rng(mu[i], sigma);
log_lik[i] = normal_lpdf(response[i] | mu[i], sigma);
}
}
mcmc_areas of the working modle with no shrinkage within each block (1-6),(7-12) and (12-18)
found the mistake: I needed to map the lowlevel values to the highlevel ones, with a look up table. Below is now a working version, which also just takes a second to finish.
data {
int<lower=1> Nr; // number of rows (entries)
int<lower=1> Nh; // number of highlevels
int<lower=1> Nl; // number of low levels
real response[Nr]; //
int<lower=1> highLevelIndex[Nr]; //
int<lower=1> lowLevelIndex[Nr]; //
int<lower=1> lookUp[Nl];//lookuptable which highlevel value a lowlevel value fits to
}
parameters {
real<lower=0> sigma; //
real<lower=0> sigma_hL; //
real<lower=0> sigma_lL; //
real a_hL[Nh]; //
real a_lL[Nl]; //
}
transformed parameters {
vector[Nr] mu;
for (rowIndex in 1:Nr) { //
mu[rowIndex] = a_lL[lowLevelIndex[rowIndex]];
}
}
model {
sigma_hL ~ exponential(0.01);
a_hL ~ normal(0, sigma_hL);
sigma_lL ~ exponential(0.01);
for (lLIdx in 1:Nl) { //
a_lL[lLIdx] ~ normal(a_hL[lookUp[lLIdx]],sigma_lL);
}
////
sigma ~ exponential(0.1);
response ~ normal(mu, sigma);
}
generated quantities { // for the PPC and loo
real n_rep[Nr];
vector[Nr] log_lik;
for (i in 1:Nr){
n_rep[i] = normal_rng(mu[i], sigma);
log_lik[i] = normal_lpdf(response[i] | mu[i], sigma);
}
}