R system() process always uses same CPU, not multi-threaded/multi-core

In R 3.0.2 on Linux 3.12.0, I am using the system() function to execute a number of tasks. I want each task to run as it would if I had launched it from the command line via Rscript, outside of R's system().

However, when executing them inside R via system(), each task is tied to the same single CPU as the master R process.

In other words:

When launched via Rscript directly from a bash shell, outside of R, each task runs on its own core where possible (this is desired)

When launched inside R via system(), each task runs on the same single core. There is no multicore sharing. If I have 100 tasks, they are all stuck on one core.

I cannot figure out how to spawn a process inside of R so that each process will use its own core.

I am using a simple test to consume CPU cycles so I can measure the effect using top/htop:

dd if=/dev/urandom bs=32k count=1000 | bzip2 -9 >> /dev/null

When this simple test is launched outside of R multiple times, each iteration gets its own core. But when I launch it inside of R:

system("dd if=/dev/urandom bs=32k count=2000 | bzip2 -9 >> /dev/null", ignore.stdout=TRUE,ignore.stderr=TRUE,wait=FALSE)

They are all stuck on a single core.

Here is a visualization after running 4 simultaneous/concurrent iterations of system().

[Screenshot: CPU usage during the four concurrent system() calls]

Please help: I need to be able to tell R to launch new tasks, with each of them running on its own core.

UPDATE DEC 4 2013:

I tried a test in Python using this:

import os
import thread
thread.start_new_thread(os.system, ("/bin/dd if=/dev/urandom of=/dev/null bs=32k count=2000",))

I repeated the new thread several times, and as expected everything worked (multiple cores used, one per thread).

So I installed the rPython package in R and tried the same from within R:

library(rPython)
python.exec("import os, thread")
python.exec("thread.start_new_thread(os.system, ('/bin/dd if=/dev/urandom of=/dev/null bs=32k count=2000',))")

Unfortunately, once again it was limited to a single core even after repeated calls. Why is it that everything launched is limited to a single core when executed from R?

Solution

Following on @agstudy's comment, you should get parallel to work first. On my system, this uses multiple cores:

f<-function(x)system("dd if=/dev/urandom bs=32k count=2000 | bzip2 -9 >> /dev/null", ignore.stdout=TRUE,ignore.stderr=TRUE,wait=FALSE)
library(parallel)
mclapply(1:4,f,mc.cores=4)

I would have written this in a comment myself, but it is too long. I know you have said that you have tried the parallel package, but I wanted to confirm that you are using it correctly. If it doesn't work, can you confirm that a call that does not involve system() runs correctly under mclapply, like this one?

a<-mclapply(rep(1e8,4),rnorm,mc.cores=4)
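As a quick sanity check that forking itself works, a minimal sketch along these lines (the variable names are only illustrative) has each worker report its process ID. It only confirms that mclapply starts separate processes on a Unix-like system; which CPU each one lands on still has to be checked in top/htop.

```r
library(parallel)

# Each forked worker returns its own process ID.
worker_pids <- unlist(mclapply(1:4, function(i) Sys.getpid(), mc.cores = 4))

# With working forking there is one worker per element, so the IDs are all
# distinct and none of them is the ID of the master R session.
n_workers <- length(unique(worker_pids))
master_is_worker <- Sys.getpid() %in% worker_pids

print(n_workers)
print(master_is_worker)
```

If n_workers is smaller than 4, or the master's own ID shows up among the workers, mclapply is not forking as expected and that is worth sorting out before looking at system().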

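Reading your comments, I suspect that your pthreads library is out of date or broken. On my system, I am using libpthread-2.15.so (not 2.13). If you're on Ubuntu, note that libpthread ships as part of glibc, so it is updated through the libc6 package (apt-get install libc6); the libpthread-stubs0 package only provides stubs and does not replace it.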

Also, note that you should be using parallel, not multicore. If you look at the docs for parallel, you'll note that they have incorporated the work on multicore.

Reading your next set of comments, I must insist that it is parallel and not multicore that has been included in R since 2.14. You can read about this on the CRAN Task View.
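One way to see this on your own installation is to check that parallel is listed among the base-priority packages and carries the same version number as R itself. This is only a small sketch; the variable names are made up for illustration.

```r
# Packages that ship with R itself have priority "base".
base_pkgs <- rownames(installed.packages(priority = "base"))
parallel_is_base <- "parallel" %in% base_pkgs

# Base packages are versioned together with R.
same_version <- as.character(packageVersion("parallel")) ==
  as.character(getRversion())

print(parallel_is_base)
print(same_version)
```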

Getting parallel to work is crucial. I previously told you that you could compile it directly from source, but this is not correct. I guess the only way to recompile it would be to compile R from source.

Can you also verify that your CPU affinity is set correctly? Also can you check if R can detect the number of cores? Just run:

library(parallel)
mcaffinity()
# Should be c(1,2,3,4) for you.
detectCores()
# Should be 4 for you.
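On Linux, a child process inherits the CPU affinity mask of its parent, so if the R session itself is pinned to one CPU, everything started through system() will be pinned too. The rough sketch below compares the mask of the R session with the mask of a process launched via system(). It is Linux-specific because it reads Cpus_allowed_list from /proc, and parse_mask and widen_affinity are hypothetical helper names, not part of any package.

```r
library(parallel)

# Hypothetical helper: extract the CPU list from the "Cpus_allowed_list:" line.
parse_mask <- function(lines) {
  hit <- grep("^Cpus_allowed_list:", lines, value = TRUE)
  sub("^Cpus_allowed_list:[[:space:]]*", "", hit)
}

# Mask of this R session (Linux only: relies on /proc).
r_mask <- parse_mask(readLines(sprintf("/proc/%d/status", Sys.getpid())))

# Mask of a process started through system(); here /proc/self is the cat
# process, which is a descendant of the R session.
child_mask <- parse_mask(system("cat /proc/self/status", intern = TRUE))

print(c(R = r_mask, child = child_mask))

# Hypothetical helper: request all detected CPUs and return the resulting mask.
# mcaffinity() returns NULL on platforms without affinity support.
widen_affinity <- function() {
  mcaffinity(seq_len(detectCores()))
}
```

If both masks show a single CPU (for example 0) while detectCores() reports more, calling widen_affinity() before the system() calls is one thing to try, since processes launched afterwards would inherit the wider mask.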