r lapply mclapply

Versions of lapply() and mclapply() that avoid redundant processing


I am looking for versions of lapply() and mclapply() that only process unique elements of the argument list X. Does something like this already exist?

EDIT: In other words, I want lapply() to skip duplicates, but length(lapply(X, ...)) should still equal length(X), not length(unique(X)), with each output value matching its input element. I am also assuming each element of X is small, so computing unique(X) should be cheap.

Current behavior:

long_computation <- function(task){
  cat(task, "\n")
  # Sys.sleep(1000)  # stands in for the expensive work
  return(task)
}
tasks <- rep(LETTERS[1:2], 2)
lapply(tasks, long_computation)

## A
## B
## A
## B
## [[1]]
## [1] "A"
## 
## [[2]]
## [1] "B"
## 
## [[3]]
## [1] "A"
## 
## [[4]]
## [1] "B"

Desired behavior:

lapply(tasks, long_computation)

## A
## B
## [[1]]
## [1] "A"
## 
## [[2]]
## [1] "B"
## 
## [[3]]
## [1] "A"
## 
## [[4]]
## [1] "B"

You can find the intended use case here.


Solution

  • This actually seems to work:

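    lightly_parallelize_atomic <- function(X, FUN, jobs = 1, ...){
      keys <- unique(X)        # distinct inputs, in order of first appearance
      index <- match(X, keys)  # position of each element of X within keys
      # mclapply() lives in the parallel package, which ships with R;
      # mc.cores > 1 is not supported on Windows.
      values <- parallel::mclapply(X = keys, FUN = FUN, mc.cores = jobs, ...)
      values[index]            # expand back to length(X)
    }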
    

    In my case, it is fine to require that X be atomic.

    Still, it would be neat to find something already built into a package or into base R.
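
For the serial case, the same unique()/match() idea can be sketched with plain lapply(). This is only an illustration: lapply_unique and counting_computation are hypothetical names made up here, not functions from base R or any package. The counter shows that FUN runs once per distinct input rather than once per element.

```r
# Hypothetical helper (not part of any package): serial analogue of the
# function above, using lapply() instead of mclapply().
lapply_unique <- function(X, FUN, ...) {
  keys <- unique(X)          # distinct inputs, in order of first appearance
  index <- match(X, keys)    # position of each element of X within keys
  values <- lapply(X = keys, FUN = FUN, ...)
  values[index]              # expand back to length(X)
}

# Hypothetical stand-in for long_computation() that counts its own calls.
calls <- 0
counting_computation <- function(task) {
  calls <<- calls + 1
  task
}

tasks <- rep(LETTERS[1:2], 2)
out <- lapply_unique(tasks, counting_computation)

length(out)  # same length as tasks
calls        # number of times FUN actually ran
```

One caveat of this approach: unique() drops names, so unlike lapply() the result does not carry over names(X). If that matters, assigning names(X) to the result before returning it would be one way to restore them.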