drake-r-package

Is it possible to split subtargets from a dynamic branching target such that later dynamic branching targets get just one subsubtarget?


I have a plan like:

plan = drake::drake_plan(

    targ1 = target(
        f1(input)
        , dynamic = map(input)
    )

    , targ2 = target(
        f2(targ1)
        , dynamic = map(targ1)
    )
)

Here the function f1 actually yields multiple chunks of output (say, in a list), and I'd like these chunks to be processed separately when targ2 is computed. Is this possible?

Here's a minimal example:

f1 = function(x){
    return(list(x,x+1))
}

f2 = function(x){
    return(x*2)
}

input = c(1,99)

plan = drake::drake_plan(
    targ1 = target(
        f1(input)
        , dynamic = map(input)
    )
    , targ2 = target(
        f2(targ1)
        , dynamic = map(targ1)
    )
)
drake::make(
    plan
)

As coded, drake throws an error when processing targ2, because the list in each sub-target of targ1 hasn't been broken apart yet. Obviously I could rewrite f2 to iterate over the list, but this is for demonstration purposes; in my actual use case there are good reasons for wanting to simply split out the results from targ1.
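To see the failure outside drake: each sub-target of targ1 holds a whole list, and multiplying a list by a number is an error in base R. The sketch below reuses f1 and f2 from the example; the drake error is assumed to come from this same operation.

```r
f1 = function(x){
    return(list(x,x+1))
}

f2 = function(x){
    return(x*2)
}

# one sub-target of targ1 is the full list returned by f1
chunk = f1(1)

# f2 on the whole list fails: arithmetic is not defined for lists
result = try(f2(chunk), silent = TRUE)
failed = inherits(result, 'try-error')

# f2 on each element separately is what targ2 is meant to do
per_element = lapply(chunk, f2)
```

The lapply() call is only there to show the intended per-element result; the goal of the question is to have drake do that splitting instead.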

I thought I had it solved with:

f1 = function(x){
    return(list(x,x+1))
}

f2 = function(x){
    return(x*2)
}

input = c(1,99)

plan = drake::drake_plan(
    targ1 = target(
        f1(input)
        , dynamic = map(input)
    )
    , targ2 = target(
        unlist(targ1)
    )
    , targ3 = target(
        f2(targ2)
        , dynamic = map(targ2)
    )
)

But in my real use case each sub-target takes up a lot of memory, and computing targ2 appears to require bringing them all into memory at once, so my machine locks up as it runs out of memory.

I've worked out a hack where I save the individual list elements from each sub-target in targ1 to file, then do a list.files() search for all such files as input to later targets, but maybe there's a simpler way?

Here's the hack that's "working" but surely less than ideal:

library(drake)

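f1 = function(x){
    out = list(x,x+1)
    # save each list element to its own file, named by the element's hash
    for(i in seq_along(out)){
        a = out[[i]]
        save(a,file=paste0(digest::digest(a),'.rda'))
    }
    return(digest::digest(out))
}

# x is unused; it only makes targ2 depend on targ1
f2 = function(x){
    # pattern is a regular expression, so escape the dot and anchor the end
    list.files(pattern='\\.rda$')
}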

f3 = function(this_rda){
    load(this_rda)
    return(a)
}

f4 = function(x){
    return(x*2)
}

input = c(1,99)

plan = drake::drake_plan(
    targ1 = target(
        f1(input)
        , dynamic = map(input)
    )
    , targ2 = target(
        f2(targ1)
    )
    , targ3 = target(
        f3(targ2)
        , dynamic = map(targ2)
    )
    , targ4 = target(
        f4(targ3)
        , dynamic = map(targ3)
    )
)
drake::make(plan)
readd(targ4)
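
One caveat about the file search in this kind of workaround: list.files() treats pattern as a regular expression, so an unescaped dot matches any character. A small base R check, using made-up file names, shows the difference between '.rda' and the stricter '\\.rda$':

```r
# hypothetical file names, for illustration only
files = c('abc123.rda', 'wordage.txt', 'notes.rda.bak')

# unescaped dot: any character followed by "rda", anywhere in the name
loose = grepl('.rda', files)

# escaped dot and end anchor: only names that end in ".rda"
strict = grepl('\\.rda$', files)
```

With hash-based file names the two patterns will usually agree, but the stricter one avoids picking up unrelated files that happen to sit in the working directory.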


Solution

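  • drake does not support dynamic branching within dynamic sub-targets, but you can combine static branching with dynamic branching to achieve something very similar. Below, static branching creates one targ1 target per input value, and each corresponding targ2 target then branches dynamically over the elements of that targ1 target.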

    library(drake)
    input_values <- c(1, 99)
    plan <- drake_plan(
      targ1 = target(
        f1(input),
        transform = map(input = !!input_values)
      ),
      targ2 = target(
        f2(targ1),
        transform = map(targ1),
        dynamic = map(targ1)
      )
    )
    
    drake_plan_source(plan)
    #> drake_plan(
    #>   targ1_1 = f1(1),
    #>   targ1_99 = f1(99),
    #>   targ2_targ1_1 = target(
    #>     command = f2(targ1_1),
    #>     dynamic = map(targ1_1)
    #>   ),
    #>   targ2_targ1_99 = target(
    #>     command = f2(targ1_99),
    #>     dynamic = map(targ1_99)
    #>   )
    #> )
    

    Created on 2020-05-28 by the reprex package (v0.3.0)