I'm using the mclapply function in the multicore package to do parallel processing. It seems that all the child processes it starts produce the same names for temporary files returned by the tempfile function. For example, if I have four processors,
library(multicore)
mclapply(1:4, function(x) tempfile())
will give four identical filenames. Obviously I need the temporary files to be different so that the child processes don't overwrite each other's files. When tempfile is used indirectly, i.e. when I call some function that itself calls tempfile, I have no control over the filename.
Is there a way around this? Do other parallel processing packages for R (e.g. foreach) have the same problem?
Update: This is no longer an issue since R 2.14.1.
CHANGES IN R VERSION 2.14.0 patched:
[...]
o tempfile() on a Unix-alike now takes the process ID into account.
This is needed with multicore (and as part of parallel) because
the parent and all the children share a session temporary
directory, and they can share the C random number stream used to
produce the unique part. Further, two children can call
tempfile() simultaneously.
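To check whether a given installation already has this fix, something like the following sketch can be run. It assumes the parallel package (which bundles mclapply in newer versions of R) instead of multicore, and the choice of two cores is arbitrary.

```r
library(parallel)  # assumed here in place of multicore; provides mclapply

# Forking is not available on Windows, so use a single core there.
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

# Each task returns the temporary file name generated in its worker process.
tmp_names <- unlist(mclapply(1:4, function(i) tempfile(), mc.cores = n_cores))

# With the patched tempfile(), no two names should coincide.
stopifnot(anyDuplicated(tmp_names) == 0)
```

If stopifnot raises an error, the children are still generating duplicate names and a workaround is needed.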
At least for now, I chose to monkey-patch my way around this by putting the following code in my .Rprofile, following Daniel's advice to use PID values.
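# Keep a reference to the original tempfile() in the base namespace.
assignInNamespace("tempfile.orig", tempfile, ns="base")
# Wrapper that appends the process ID to the filename pattern.
.tempfile = function(pattern="file", tmpdir=tempdir()) {
tempfile.orig(paste(pattern, Sys.getpid(), sep=""), tmpdir)
}
# Replace base::tempfile with the PID-aware wrapper.
assignInNamespace("tempfile", .tempfile, ns="base")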
Obviously it's not a good option for any package you'd distribute, but for a single user's need it's the best option thus far since it works in all cases.
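For code where you call tempfile yourself, the same PID idea can be applied without touching the base namespace. The sketch below is only an illustration: pid_tempfile is a hypothetical helper name, not part of base R or multicore, and it does nothing for functions that call tempfile internally.

```r
# Hypothetical helper (not part of base R): a tempfile() wrapper that
# puts the current process ID into the filename pattern.
pid_tempfile <- function(pattern = "file", tmpdir = tempdir()) {
  tempfile(pattern = paste(pattern, Sys.getpid(), "_", sep = ""), tmpdir = tmpdir)
}

# Example call; the prefix "result" is arbitrary.
f <- pid_tempfile("result")
```

Since each forked child has its own PID, names generated this way should differ between children even when they share the random number stream.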