Using rstan, I am running code that uses 4 cores in parallel. I have access to a computer with 32 cores and I need to run 3 instances of the same code on different datasets, and another 3 instances of a slightly different code on the same datasets, for a total of 6 models. I'm having a hard time figuring out the best way to accomplish this. Ideally, the computer would be running 4 cores on each model for a total of 24 cores in use at a time.
I've used the parallel package many times before, but I don't think it can handle this kind of "parallel in parallel". I am also aware of the Jobs feature in RStudio, but one of the nice things about rstan is that it interactively shows you how the chains progress, so ideally I would like to keep seeing those updates. Can this be accomplished by having 6 different RStudio sessions open at once? I tried running two at a time, but I'm not sure whether they also run in parallel with each other, so any clarification would be great.
I would suggest using batch jobs instead. In principle, since you don't have that many models, you could simply write 6 different R scripts and store them as, e.g., model1.R, model2.R, ..., model6.R. With that, you could then submit the jobs from the command line like this:
R CMD BATCH --vanilla model1.R model1.Rout &
This runs the first script in batch mode and writes its standard output to a log file, model1.Rout, so you can check on the state of the job simply by opening that file. Of course, you will need to run the above command once for each model script.
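For reference, one of those scripts could look roughly like the sketch below; the dataset file, the Stan program, and the output file name are placeholders you would replace with your own.

# model1.R -- minimal sketch of one batch script; file names are hypothetical
library(rstan)

options(mc.cores = 4)              # let this model's 4 chains run on 4 cores
rstan_options(auto_write = TRUE)   # cache the compiled Stan model on disk

dat <- readRDS("dataset1.rds")     # placeholder dataset for this model
fit <- stan(
  file   = "model_a.stan",         # placeholder Stan program
  data   = dat,
  chains = 4,
  iter   = 2000
)

saveRDS(fit, "fit_model1.rds")     # save the fit so it can be inspected later

Since R CMD BATCH captures everything the script prints, the chain progress messages that rstan normally shows interactively should end up in model1.Rout, so you can follow the sampling by re-opening (or tailing) that file.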