I have to convert a large number of RAW images and am using the program DCRAW to do that. Since this program only uses one core, I want to parallelize the calls in R. To invoke it I use:
system("dcraw.exe -4 -T image.NEF")
This outputs a file called image.tiff in the same folder as the NEF file, which is exactly what I want. I have tried multiple R packages to parallelize this, but I only get nonsensical returns (probably my fault). I want to run a large list of files (1000+), obtained with list.files(), through this system call in R.
I could only find info on parallel programming for variables within R but not for system calls. Anybody got any ideas? Thanks!
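For reference, a minimal sketch of the serial version I am running now (the file pattern and flags are just what I use above; the loop itself is only for illustration):

# hypothetical serial version: one dcraw call per file, one core at a time
files <- list.files(pattern = "\\.NEF$", full.names = TRUE)
for (fn in files) {
  system(paste("dcraw.exe -4 -T", shQuote(fn)))
}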
It doesn't matter whether you parallelize over variables or over system() calls. Assuming you're not on Windows (where mclapply cannot fork and therefore won't run in parallel), on any decent system you can run
# convert every *.NEF in the working directory, 8 dcraw processes at a time
parallel::mclapply(Sys.glob("*.NEF"),
                   function(fn) system(paste("dcraw.exe -4 -T", shQuote(fn))),
                   mc.cores = 8, mc.preschedule = FALSE)
It will run 8 jobs in parallel. But then you may as well not use R at all and run
ls *.NEF | parallel -u -j8 'dcraw.exe -4 -T {}'
instead (using GNU parallel).
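If you are stuck on Windows, a socket cluster is the usual workaround for the missing fork support; a rough sketch, assuming the .NEF files sit in the working directory and 8 workers are appropriate:

# sketch for Windows: use a PSOCK cluster instead of forking
library(parallel)
cl <- makeCluster(8)                  # start 8 worker R sessions
files <- Sys.glob("*.NEF")
parLapply(cl, files, function(fn)
  system(paste("dcraw.exe -4 -T", shQuote(fn))))
stopCluster(cl)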