Search code examples
rloopsforeachsequentialstochastic

%dopar% or alternative method to speed up sequential stochastic calculation


I have written a stochastic process simulator but I would like to speed it up since it's pretty slow.

The main part of the simulator is made of a for loop which I would like to re-write as a foreach with `%dopar%.

I have tried doing so with a simplified loop but I'm running into some problems. Suppose my for loop looks like this

library(foreach)

r=0
t<-rep(0,500)
for(n in 1:500){
    s<-1/2+r
    u<-runif(1, min = 0, max = 1)
    if(u<s){
        t[n]<-u
        r<-r+0.001
    }else{r<-r-0.001}
}

which means that at each iteration I update the value of r and s and, in one of the two outcomes, populate my vector t. I have tried several different ways of re-writing it as a foreach loop but it seems like with each iteration my values don't get updated and I get some pretty strange results. I have tried using return but it doesn't seem to work!

This is an example of what I have come up with.

rr=0
tt<-foreach(i=1:500, .combine=c) %dopar% {
    ss<-1/2+rr
    uu<-runif(1, min = 0, max = 1)
    if(uu<=ss){
        return(uu)
        rr<-rr+0.001
    }else{
        return(0)
        rr<-rr-0.001}
}

If it is impossible to use foreach what other way is there for me to re-write the loop so to be able to use all cores and speed up things?


Solution

  • Since your comments, about turning to C, were encouraging and -mostly- to prove that this isn't a hard task (especially for such operations) and it's worth looking into, here is a comparison of two sample functions that accept a number of iterations and perform the steps of your loop:

    ffR = function(n) 
    {
       r = 0
       t = rep(0, n)
       for(i in 1:n) {
           s = 1/2 + r
           u = runif(1)
           if(u < s) {
               t[i] = u
               r = r + 0.001
           } else r = r - 0.001
       }
    
       return(t)
    }
    
    
    ffC = inline::cfunction(sig = c(R_n = "integer"), body = '
        int n = INTEGER(AS_INTEGER(R_n))[0];
    
        SEXP ans;
        PROTECT(ans = allocVector(REALSXP, n));
    
        double r = 0.0, s, u, *pans = REAL(ans);
    
        GetRNGstate();
    
        for(int i = 0; i < n; i++) {
            s = 0.5 + r;
            u = runif(0.0, 1.0);
    
            if(u < s) {
                pans[i] = u;
                r += 0.001;
            } else {
                pans[i] = 0.0;
                r -= 0.001;
            }
        }
    
        PutRNGstate();
    
        UNPROTECT(1);
        return(ans);
    ', includes = "#include <Rmath.h>")
    

    A comparison of results:

    set.seed(007); ffR(5)
    #[1] 0.00000000 0.39774545 0.11569778 0.06974868 0.24374939
    set.seed(007); ffC(5)
    #[1] 0.00000000 0.39774545 0.11569778 0.06974868 0.24374939
    

    A comparison of speed:

    microbenchmark::microbenchmark(ffR(1e5), ffC(1e5), times = 20)
    #Unit: milliseconds
    #       expr        min         lq     median         uq        max neval
    # ffR(1e+05) 497.524808 519.692781 537.427332 668.875402 692.598785    20
    # ffC(1e+05)   2.916289   3.019473   3.133967   3.445257   4.076541    20
    

    And for the sake of completeness:

    set.seed(101); ans1 = ffR(1e5)
    set.seed(101); ans2 = ffC(1e5)
    all.equal(ans1, ans2)
    #[1] TRUE
    

    Hope any of this could be helpful in some way.