Tags: r, hash, memoization, memoise

Retrieve memoised Object using readRDS() and hash


Consider the following code:

library(cachem)
library(memoise)

cache.dir <- "/Users/abcd/Desktop/temp_cache/"
cache <- cachem::cache_disk(dir = cache.dir, max_size = 1024^2)

fun <- function (x) {x^2}

fun.memo <- memoise(f = fun, cache = cache)

res.1 <- fun.memo(x = 2)
res.2 <- fun.memo(x = 3)

So far so good. I can compute fun.memo once and retrieve its results later by calling it again.

Now I have the following "problem": I have a lengthy script with several memoised function calls. At the end I only want to further process the output of the last function call, which depends on the output of memoised function calls further up in the script. It would be nice if I could retrieve the memoised objects directly from the .rds files in cache.dir. That would let me skip the lengthy script on top (not for performance reasons, since memoise already takes care of that, but to keep the code short). I am thinking of something like:

setwd(cache.dir)
res.y <- readRDS(paste0(my.hash.1, ".rds"))
res.z <- readRDS(paste0(my.hash.2, ".rds"))

However, I can't generate those hashes in the filenames again:

rlang::hash(x = res.1)
rlang::hash(x = res.2)

rlang::hash(x = fun)
rlang::hash(x = fun.memo)

These all yield hashes that differ from the ones in the filenames. It seems that the hash generated within memoise is not the hash of any of these objects.

I know that retrieving the objects like that is sub-optimal since then it is not clear what arguments they resulted from. Still it would be nice to avoid the lengthy code on top. Of course, I could wrap all the preceding code into a function or a script and source() it, but that's not the point here. Any advice?


Solution

  • I think you are somewhat wasting your time, but if you look at the memoise internals you can see how the keys are determined, and you can hack your way to the key hashes it chooses. Because in this case there aren't any `_additional` values (extra restrictions passed to memoise()), I can boil it down to:

    
    library(cachem)
    library(memoise)
    
    cache.dir <- tempdir(check=TRUE)
    cache <- cachem::cache_disk(dir = cache.dir, max_size = 1024^2)
    
    fun <- function (x) {x^2}
    
    fun.memo <- memoise(f = fun, cache = cache)
    
    res.1 <- fun.memo(x = 2)
    
    list.files(path=cache.dir)
    # [1] "37513c63752949a0ae8d9befd52c6ad1.rds"   ....
    
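    # The cache key is the hash of: (hash of fun's formals and body) followed by
    # the evaluated, named call arguments.
    rlang::hash(c(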
      rlang::hash(list(formals(fun), as.character(body(fun)))),
      list(x=2)))
    # 37513c63752949a0ae8d9befd52c6ad1
    

    Please don't do this :D
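If you do it anyway, the recipe above could be wrapped up roughly as follows. This is only a sketch under several assumptions: `memo_key()` is a hypothetical helper (it is not part of memoise), it assumes the default `rlang::hash` hashing, no extra restrictions passed to `memoise()`, no `omit_args`, and no default arguments on the function. It also assumes the cached entry is the list produced by `withVisible()`, which is what the memoise source I looked at stores, so the result sits in `$value`. Any of this may change between memoise versions.

```r
library(cachem)
library(memoise)

# Throwaway cache directory so the sketch does not touch a real cache.
cache.dir <- file.path(tempdir(), "memo_cache_sketch")
dir.create(cache.dir, showWarnings = FALSE)
cache <- cachem::cache_disk(dir = cache.dir, max_size = 1024^2)

fun <- function (x) {x^2}
fun.memo <- memoise(f = fun, cache = cache)
res.1 <- fun.memo(x = 2)

# Hypothetical helper: rebuild the cache key for a call to the memoised
# version of `f` with the named arguments in `args`, following the recipe
# above. Relies on memoise internals, so treat it as fragile.
memo_key <- function(f, args) {
  f.hash <- rlang::hash(list(formals(f), as.character(body(f))))
  rlang::hash(c(f.hash, args))
}

key <- memo_key(fun, list(x = 2))

# Check that the reconstructed key is really in the cache before using it.
found <- cache$exists(key)

# Read the entry through the cache object ...
entry <- if (found) cache$get(key) else NULL

# ... or straight from the file, as in the question.
entry.rds <- if (found) readRDS(file.path(cache.dir, paste0(key, ".rds"))) else NULL

# Assumption: the entry is list(value = ..., visible = ...).
res.y <- entry$value
```

Checking `cache$exists(key)` first means that a mismatch (for example after a memoise update, or after editing the body of `fun`) shows up as a missing key instead of a confusing `readRDS()` error.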