Tags: r, parallel-processing, future, r-future

future waits for previous future to conclude when sending to remote


I do the following to send a bunch of models to a compute server.

future waits for the first call to finish before the next is sent. How do I tell future that it can send multiple jobs to the remote at the same time?

This is clearly possible, as I can send multiple jobs to the same remote from different local R sessions, or if I call plan(login) again between calls. But how do I specify the topology so that future doesn't wait and I don't have to repeatedly call plan()?

library(future)

# One remote worker, used as the evaluation plan
login <- tweak(remote, workers = "me@localcomputeserver.de")
plan(list(login))

# Each future runs a long job on the remote and saves its result there
bla  %<-% { bla  <- rnorm(1000); Sys.sleep(100); saveRDS(bla,  file = "bla.rds");  bla  }
bla2 %<-% { bla2 <- rnorm(1000); Sys.sleep(100); saveRDS(bla2, file = "bla2.rds"); bla2 }

Solution

  • Author of future here: If you're happy with separate R processes on your remote machine, you can use:

    library("future")
    remote_machine <- "me@localcomputeserver.de"
    plan(cluster, workers = rep(remote_machine, times = 2L))
    

    to get two remote workers on the same machine. That way you can have two active futures at the same time without blocking.
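
    For example, here is a minimal sketch that reuses the futures from the question (the hostname is the one given there; the .rds files are written in the remote worker's working directory):

    library("future")
    plan(cluster, workers = rep("me@localcomputeserver.de", times = 2L))

    # Both futures are dispatched immediately, one per worker, so the
    # second no longer waits for the first to finish.
    bla  %<-% { bla  <- rnorm(1000); Sys.sleep(100); saveRDS(bla,  file = "bla.rds");  bla  }
    bla2 %<-% { bla2 <- rnorm(1000); Sys.sleep(100); saveRDS(bla2, file = "bla2.rds"); bla2 }

    # The local session only blocks once a value is actually requested.
    str(bla)
    str(bla2)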

    FYI, plan(remote, ...) is basically just plan(cluster, persistent = TRUE, ...), where "persistent" means that R variables survive on the worker across multiple future calls. You rarely want that, so use cluster instead.
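
    To illustrate the default (non-persistent) behaviour, here is a small sketch; the two localhost workers and the variable names are assumptions chosen so it can be run without a remote machine:

    library("future")
    plan(cluster, workers = rep("localhost", times = 2L))  # stand-in for the remote machine

    x <- 42
    f <- future({ y <- x + 1; y })  # 'x' is exported to the worker automatically
    value(f)                        # 43

    # 'y' does not survive on the worker: without persistent = TRUE, each
    # future is evaluated in a fresh environment.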