Search code examples
rterra

How can I incoporate terra objects in the output list of a function?


I am currently updating the virtualspecies R package to shift from raster to terra.

In this package I used to ship rasters from the package raster in the output list of functions. These outputs were a simple list with an appended class called virtualspecies where one to two elements of the list were objects of class RasterLayer or RasterStack.

For example, here is how an output object was created in this package, taken from here:

  results <- list(approach = "pca",
                  details = list(variables = sel.vars,
                                 pca = pca.object,
                                 rescale = rescale,
                                 axes = axes, 
                                 means = means,
                                 sds = sds),
                  suitab.raster = suitab.raster)
  class(results) <-  append("virtualspecies", class(results))

suitab.raster in this example was an object of class RasterLayer.

These virtualspecies objects are part of a workflow of functions where the next function would take the raster(s) from the list, make new calculations, and oftentimes add another raster into the output list. These objects could be exported with e.g. saveRDS() and kept their functionality.

To shift this package from raster to terra, my question is: How can I incorporate terra objects into simple output lists from a function in a way that ensures these objects remain self-contained and can be easily transferred between different R sessions or computers?

Note that I would like to keep the objects self-contained as much as possible, since this is one of the arguments we used in the paper where we introduced the virtualspecies package:

"the generated virtual species can be stored on the hard disk drive and provided in online supplementary materials of articles"

I am currently considering three possibilities:

  1. Use the wrap() function from package terra to store spatRasters objects in output lists. This would probably imply to often wrap/unwrap objects between functions of the package. However, as far as I understood (I am only starting on this aspect) I would probably need to use proxy = FALSE to ensure that objects are self-contained and portable, which could be problematic for large rasters or low memory computers.

  2. Use writeRaster() and try to keep track of the locations where the rasters are saved in the output list. In that case the objects would no longer be self-contained. In addition, I am reluctant on writing files on the disk because I am not very familiar with the best practices of doing that inside a package, and how to ensure the objects will be portable across OS without adding too much complexity for users.

  3. Use a combination of 1. and 2.: use wrap() as the preferred method, but when it is not memory-safe or simply fails, then use writeRaster() and keep track of the locations where the file were written so that the object can be re-read into R memory when a function needs to access the raster.

Which of these approaches (or one not mentioned above) is objectively the best and why?


Solution

  • For now I have opted for solution #1, wrapping terra objects in the output list as follows:

    results <- list(approach = "pca",
                    details = list(variables = sel.vars,
                                   pca = pca.object,
                                   rescale = rescale,
                                   axes = axes, 
                                   means = means,
                                   sds = sds),
                    suitab.raster = wrap(suitab.raster,
                                         proxy = FALSE))
    class(results) <-  append("virtualspecies", class(results))
    

    And I added generic functions to $ and [[ so that when users want to access the rasters they will get an unwrapped raster, rather than a wrapped raster.

    By doing so I also unintentionally made the rest of the code much easier, since I don't have to add unwrap() everytime I want to access a raster object from one of these lists.

    Here is how I added these generics to automatically unwrap terra objects when accessed with $ or [[.

    #' @export
    #' @method `$` virtualspecies
    `$.virtualspecies` <- function(x, name)
    {
      if(inherits(as.list(x)[[name]], "PackedSpatRaster")) {
        return(unwrap(`[[`(as.list(x), name)))
      } else {
        return(`[[`(as.list(x), name))
      }
    }
    
    #' @export
    #' @method `[[` virtualspecies
    `[[.virtualspecies` <- function(x, name)
    {
      if(inherits(as.list(x)[[name]], "PackedSpatRaster")) {
        return(unwrap(`[[`(as.list(x), name)))
      } else {
        return(`[[`(as.list(x), name))
      }
    }
    
    #' @export
    #' @method `as.list` virtualspecies
    as.list.virtualspecies <- function(x)
    {
      class(x) <- "list"
      return(x)
    }
    

    This code works, but I am still uncertain about two aspects:

    1. What will happen if users try to wrap() very large rasters? I am not yet sure how to handle this situation.
    2. I wonder if the way I wrote the generic $ and [[ functions is adequate and will not cause issues in cases I have not tried yet.