I am a beginner using stan (and stackoverflow by the way, I have absolutely no idea how you pretty print a dataframe on it, could not find how, sorry).
Let's say I want to make the following model :
y ~ normal(I*S + P*J,sigma)
P ~ normal(dP,1)
(to simplify this example I fix the deviation around P at 1)
where on one hand, I is a n,p matrix predictor and S are the corresponding regression coefficients (size p)
and on the other hand dP and J can only have 3 different values but my dataframe is constructed such as they looks like this (R-project):
dp <- c(0,0,0,0,0,0,0,2,2,2,2,2,2,2,1,1,1,1,1,1)
J <- c(5.2,5.2,....,2.3,2.3,....,7.5,7.5,...)
parameters are S, P and sigma.
I do no want stan to change every components of P, dp represents 3 types of data and I only want three different values of P corresponding to the 3 differents values of dp. However each row of my dataframe contains different values of I.
edit: said in another way: for each row k, I want :
y[k] ~ I[k,1]*S[1]+I[k,2]*S[2]...+ real_value_P * J[k]
How can I achieve that ?
Here is my code:
data {
int < lower = 1 > NR; // Number of rows
int < lower = 1 > NC; // Number of columns
matrix [NR,NC] I ;// Predictor I
vector [NR] dP;
vector [NR] J ;
vector [NR] y; // Outcome
}
parameters {
real < lower = 0 > sigma; // Error SD
vector [NC] S ;
vector [NC] P ;
}
model {
P ~ normal(dP,1)
y ~ normal(I*S+P*J,sigma) ;
}
I am not sure I have been really clear, stats are still a tough subject to me either and my model is a bit more complicated than presented.
Thanks
The "trick" seems to indicate in a vector ("indices" here) which value of P to take (1,1,1,1,1,...2,2,2,2,2,...,3,3,3,3,3) and then to loop over it in the parameters to assign the correct value :
transformed parameters {
vector [NR] JP ; //J*P
for (k in 1:NR){
JP[k]=True_P[indices[k]]*J[k] ;
}
Hence the complete code :
data {
int < lower = 1 > NR; // Number of rows
int < lower = 1 > NC; // Number of columns
matrix [NR,NC] I ;// Predictor I
int indices[NR]; // indices
vector [3] P ;
vector [NR] dP ;
vector [NR] J ;
vector [NR] y; // Outcome
}
parameters {
real < lower = 0 > sigma; // Error SD
vector < lower = 0 > [NC] S ; // regression coefficients for predictors I
vector [2] True_P ;
}
transformed parameters {
vector [NR] JP ; //J*P
for (k in 1:NR){
JP[k]=True_P[indices[k]]*J[k] ;
}
}
model {
for (k in 1:3){
P[k] ~ normal(True_P[k],1) ;
}
y ~ normal(I*S+JP,sigma) ;
}
generated quantities {
} // The posterior predictive distributiondistribution