
slurm_apply a RefClass method from within a RefClass method


EDIT: Version 0.4.0 of rslurm makes the solution very easy. See my answer below.

Apologies for the longer-than-desired MWE and for a title that, I realized after submitting the question, may be needlessly complicated. I believe the real issue is getting the environment of a RefClass object into rslurm::slurm_apply.

MWE

Here I define a toy reference class called BankAccount. It has two fields and two methods.

The fields are transactions, a numeric vector of all transactions associated with the account, and suspicion_threshold, the value above which the bank will investigate a transaction.

The two methods are is_suspicious, which compares the transactions with the suspicion_threshold on the local machine, and is_suspicious_slurm, which uses rslurm::slurm_apply to spread many calls to is_suspicious over a cluster of computers managed by SLURM. With many transactions, or a more expensive is_suspicious function, this might be necessary.

So, here's the setup:

BankAccount <- setRefClass(
    Class = 'BankAccount',
    fields = list(
        transactions = 'numeric',
        suspicion_threshold = 'numeric'
    )
)

BankAccount$methods(
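    is_suspicious = function(start_idx = 1, stop_idx = length(transactions)) {
        # which() returns positions relative to the chunk; shift them back
        # to indices into the full transactions vector
        return(start_idx + which(transactions[start_idx:stop_idx] > suspicion_threshold) - 1)
    }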
)

BankAccount$methods(
    is_suspicious_slurm = function(num_nodes) {

        usingMethods(is_suspicious)

        t <- length(transactions)
        t_per_n <- floor(t/num_nodes)
        starts <- seq(from = 1, length.out = num_nodes, by = t_per_n)
        stops <- seq(from = t_per_n, length.out = num_nodes,  by = t_per_n)
        stops[num_nodes] <- t

        sjob <- rslurm::slurm_apply(f = is_suspicious,
                                    params = data.frame(start_idx = starts,
                                                        stop_idx = stops),
                                    nodes = num_nodes,
                                    add_objects = .self)

        results_list <- rslurm::get_slurm_out(slr_job = sjob,
                                              outtype = "raw",
                                              wait = TRUE)

        return(unlist(results_list))
    }
)
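
The chunking arithmetic inside is_suspicious_slurm can be checked on its own, without a cluster. A minimal sketch, assuming a stand-alone helper named chunk_bounds (hypothetical, not part of the class or of rslurm) that repeats the same seq() calls:

```r
# Hypothetical helper: the same index arithmetic as is_suspicious_slurm,
# pulled out so it can be run without a cluster.
chunk_bounds <- function(t, num_nodes) {
    t_per_n <- floor(t / num_nodes)
    starts <- seq(from = 1, length.out = num_nodes, by = t_per_n)
    stops <- seq(from = t_per_n, length.out = num_nodes, by = t_per_n)
    stops[num_nodes] <- t  # the last chunk absorbs the remainder
    data.frame(start_idx = starts, stop_idx = stops)
}

bounds <- chunk_bounds(t = 500, num_nodes = 3)
print(bounds)
#   start_idx stop_idx
# 1         1      166
# 2       167      332
# 3       333      500
```

With 500 transactions and 3 nodes, the first two chunks hold 166 transactions each and the last holds 168. The sketch assumes num_nodes does not exceed t; otherwise t_per_n is 0 and the chunks degenerate.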

Now, on my local machine I can run:

library(RCexampleforSE)

set.seed(27599)

b <- BankAccount$new()
b$transactions <- rnorm(n = 500)
b$suspicion_threshold <- 2

b$is_suspicious()

and it works as expected:

62 103 155 171 182 188 297 398 493 499
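
The fan-out itself can also be dry-run locally before involving SLURM. A minimal sketch, assuming a plain function find_suspicious (a hypothetical stand-in for the is_suspicious method), hard-coded chunk bounds for 500 transactions on 3 nodes, and Map() standing in for rslurm::slurm_apply, which likewise makes one call to f per row of params:

```r
set.seed(27599)
transactions <- rnorm(n = 500)
suspicion_threshold <- 2

# Hypothetical plain-function stand-in for the is_suspicious method
find_suspicious <- function(start_idx, stop_idx) {
    start_idx + which(transactions[start_idx:stop_idx] > suspicion_threshold) - 1
}

# Same shape as the params data frame passed to slurm_apply
params <- data.frame(start_idx = c(1, 167, 333),
                     stop_idx = c(166, 332, 500))

# One call per row of params, run locally instead of on cluster nodes
chunked <- unlist(Map(find_suspicious, params$start_idx, params$stop_idx))

# Single pass over the whole vector, for comparison
direct <- find_suspicious(1, length(transactions))

stopifnot(isTRUE(all.equal(chunked, direct)))
```

If the chunked and direct results agree here, any failure on the cluster is about getting the function and its data to the workers, not about the index arithmetic.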

If I run:

b$is_suspicious_slurm(num_nodes = 3)

I get an error, since my personal computer is not connected to a SLURM cluster.

sh: squeue: command not found
Cannot submit; no SLURM workload manager on path
Submission scripts output in directory _rslurm_13ba46e3c70b0
Error in rslurm::get_slurm_out(slr_job = sjob, outtype = "raw", wait = TRUE): slr_job has not been submitted

If I log on to my university cluster, which uses SLURM, and run the same script, the setup and the local method work just as they did on my personal computer. When I run:

b$is_suspicious_slurm(num_nodes = 3)

it sends jobs to the cluster, as hoped for:

Submitted batch job 6363868

But these jobs error immediately with the following error message in slurm_0.out, slurm_1.out, and slurm_2.out:

Error in attr(, "mayCall") : argument 1 is empty
Execution halted

Thoughts and Attempts

I figure the job probably needs the BankAccount object but doesn't have it available. So I tried passing it in as the add_objects parameter to rslurm::slurm_apply:

sjob <- rslurm::slurm_apply(f = is_suspicious,
                            params = data.frame(start_idx = starts,
                                                stop_idx = stops),
                            nodes = num_nodes,
                            add_objects = .self)

I also tried it in quotes and inside eval(), neither of which worked.

How can I make the object accessible to the worker jobs created with rslurm::slurm_apply?


Solution

  • Version 0.4.0 of rslurm completely solved this problem.

    Define is_suspicious_slurm() as:

    BankAccount$methods(
        is_suspicious_slurm = function(num_nodes) {
    
            usingMethods(is_suspicious)
    
            t <- length(transactions)
            t_per_n <- floor(t/num_nodes)
            starts <- seq(from = 1, length.out = num_nodes, by = t_per_n)
            stops <- seq(from = t_per_n, length.out = num_nodes,  by = t_per_n)
            stops[num_nodes] <- t
    
            sjob <- rslurm::slurm_apply(f = is_suspicious,
                                        params = data.frame(start_idx = starts,
                                                            stop_idx = stops),
                                        nodes = num_nodes)
    
            results_list <- rslurm::get_slurm_out(slr_job = sjob,
                                                  outtype = "raw",
                                                  wait = TRUE)
    
            return(unlist(results_list))
        }
    )
    

    The only change is that the call to rslurm::slurm_apply no longer specifies the add_objects parameter. It is not needed because, as @Ian pointed out:

    "...you don't need to pass self at all when slurm_apply sends the serialized function, which appears to include both ".self" and "transactions" in the enclosing environment."
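
The behaviour described in that comment can be seen without a cluster: a method pulled off a RefClass object is a closure whose enclosing environment is the object itself, so serializing the function carries the fields along. A minimal sketch, assuming a reduced class named Account (hypothetical) and using base R's serialize()/unserialize() as a stand-in for however rslurm ships f to the nodes:

```r
library(methods)

# Hypothetical reduced version of BankAccount, for illustration only
Account <- setRefClass(
    Class = 'Account',
    fields = list(transactions = 'numeric',
                  suspicion_threshold = 'numeric'),
    methods = list(
        is_suspicious = function(start_idx = 1, stop_idx = length(transactions)) {
            start_idx + which(transactions[start_idx:stop_idx] > suspicion_threshold) - 1
        }
    )
)

acct <- Account$new(transactions = c(1, 5, 0.5, 3), suspicion_threshold = 2)

# The extracted method's enclosing environment is the object environment
f <- acct$is_suspicious
env <- environment(f)
has_self <- exists(".self", envir = env, inherits = FALSE)
has_field <- exists("transactions", envir = env, inherits = FALSE)

# Round trip through bytes: the copy brings its own copy of the fields
f_copy <- unserialize(serialize(f, connection = NULL))
result_copy <- f_copy()

# The copy is a snapshot: later changes to acct do not reach it
acct$suspicion_threshold <- 10
result_live <- f()
result_snapshot <- f_copy()
```

Here result_copy and result_snapshot should both flag positions 2 and 4, while result_live is empty because f still reads the live object. The same snapshot behaviour would apply to a function shipped to worker jobs: they see the field values as of submission.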