
R snowfall: parallel apply on table columns


I have a table M with many rows and columns, read from a tab-separated text file:

M <- read.table("text.csv",header=TRUE,sep="\t")

To rank the values within each column, I successfully used:

M <- apply(M,2,rank)

I would like to speed up this computation, but I have not managed to do the same thing with snowfall.

I tried:

library(snowfall)
sfStop()
nb.cpus <- 8
sfInit(parallel=TRUE, cpus=nb.cpus, type = "SOCK")
M <- sfClusterApplyLB(M, rank) # does not work
M <- sfClusterApply(M,2,rank) # does not work
M <- sfClusterApplyLB(1:8, rank,M) # does not work

What is the snowfall equivalent of M <- apply(M,2,rank)?

Thanks in advance for your help!


Solution

  • The equivalent of apply in snowfall is sfApply. Here's an example:

    library(snowfall)
    sfInit(parallel=TRUE, cpus=4, type="SOCK")
    M <- data.frame(matrix(rnorm(40000000), 2000000, 20))
    r <- sfApply(M, 2, rank)
    sfStop()
    

    This example runs almost twice as fast as the sequential version on my Linux machine using four cores. That's not too bad considering that rank isn't very computationally intensive.
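
As a quick sanity check, the sketch below compares sfApply against the sequential apply on a small matrix. The matrix size, the seed, and cpus=2 are arbitrary choices for illustration and are not taken from the post above; it also assumes the snowfall package is installed and that socket workers can be started on the local machine.

```r
library(snowfall)

# Small illustrative input (size and seed are arbitrary assumptions).
set.seed(1)
M <- matrix(rnorm(1000 * 6), nrow = 1000, ncol = 6)

# Sequential reference: rank the values within each column.
r_seq <- apply(M, 2, rank)

# Parallel version: same margin and function, dispatched to the workers.
sfInit(parallel = TRUE, cpus = 2, type = "SOCK")
r_par <- sfApply(M, 2, rank)
sfStop()

# Compare the two results, ignoring attributes such as dimnames.
same <- isTRUE(all.equal(r_seq, r_par, check.attributes = FALSE))
print(same)
```

On an input this small the parallel call will probably be slower than apply, because starting the workers and shipping the data costs more than the ranking itself; the point here is only to confirm that both calls return the same ranks.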