
Appending function results to a list in R


I have created some functions that produce plots. The goal is to store each function's output in a single list, analysisObjects, which is a list of lists. The code below produces exactly the expected result.

library(ggplot2)

createPlot1 <- function(data = mtcars){

  plot1 <- ggplot(data, aes(factor(cyl))) +
    geom_boxplot() + coord_flip()

  plot2 <- ggplot(data, aes(factor(cyl))) +
    geom_boxplot()

  res <- list("mycars" = list(list("objType" = "plot",
                                   "object" = plot1),
                              list("objType" = "plot",
                                   "object" = plot2)))

  # append to the global list, which must already exist
  analysisObjects <<- append(analysisObjects, res)

}


createPlot2 <- function(data = iris){

  plot1 <- ggplot(data, aes(x = Sepal.Length, y = Sepal.Width)) +
    geom_point(aes(color = Species, shape = Species))

  plot2 <- ggplot(data, aes(x = Species, y = Sepal.Length)) +
    geom_boxplot(aes(fill = Species))

  res <- list("myflowers" = list(list("objType" = "plot",
                                      "object" = plot1),
                                 list("objType" = "plot",
                                      "object" = plot2)))

  analysisObjects <<- append(analysisObjects, res)

}

The problem is that analysisObjects has to be created a priori, and it is overwritten every time I run a function. I tried to avoid this by writing a separate function that appends a function's output to analysisObjects if it exists. If analysisObjects doesn't exist, the list is created first:

appendResults <- function(object = res){
  if(exists("analysisObjects")){
    analysisObjects <<- append(analysisObjects, object)
  }else{
    analysisObjects <<- list()
    analysisObjects <<- append(analysisObjects, object)
  }
}

This function replaces the line analysisObjects <<- append(analysisObjects, res), but the process still seems quite ugly. Is there a better way to append the results of a function to a list? Ideally one where the order in which the functions are executed doesn't matter.


Solution

  • I believe you're approaching the problem from the wrong premise.

    Why don't you try something like this:

    library(ggplot2)
    
    # generic for methods dispatch
    createPlot <- function(data){
     
     UseMethod("createPlot")
     
    }
    
    # method for class "someclass"
    createPlot.someclass <- function(data){
     
     plot1 <- ggplot(data, aes(factor(cyl))) +
      geom_boxplot() + coord_flip()
     
     plot2 <- ggplot(data, aes(factor(cyl))) +
      geom_boxplot() 
     
     list(list("objType" = "plot", "object" =  plot1),
          list("objType" = "plot", "object" =  plot2))
     
    }
    
    # method for class "someotherclass"
    createPlot.someotherclass <- function(data){
     
     plot1 <-  ggplot(data, aes(x = Sepal.Length, y = Sepal.Width)) +
      geom_point(aes(color=Species, shape=Species)) 
     
     plot2 <- ggplot(data, aes(x=Species, y=Sepal.Length)) +
      geom_boxplot(aes(fill=Species)) 
     
     list(list("objType" = "plot", "object" =  plot1),
          list("objType" = "plot", "object" =  plot2))
    
    }
    
    # prepare your data and assign the right class. It will be needed for dispatch
    class(mtcars) <- c("someclass"     , class(mtcars))
    class(iris)   <- c("someotherclass", class(iris))
    
    # make a list of your data frames and give each one a name (if you want to)
    mylist <- list(mycars = mtcars, myflowers = iris)
    
    # create your list
    analysisObjects <- lapply(mylist, createPlot)
    

    The idea is to create analysisObjects once, in a single clean step, instead of growing it from inside each function.

    You want to have a list at the end, so I suppose you want to loop over some data frames to get a final result.

    What you can do is prepare all your data frames at the beginning and then, at the end, plot them all in one go.

    You can exploit classes and methods, for example.

    createPlot is a generic that dispatches on the class of its argument.

    I just invented some names for your classes but you can make prettier names.
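
    To see the dispatch mechanism in isolation, here is a minimal base R sketch that needs no plotting at all. The generic describe and the class names cars_data and flowers_data are invented for illustration and are not part of the code above.

    ```r
    # Generic: UseMethod() picks a method based on class(x)
    describe <- function(x) {
      UseMethod("describe")
    }

    # Method for the hypothetical class "cars_data"
    describe.cars_data <- function(x) {
      paste("cars data with", nrow(x), "rows")
    }

    # Method for the hypothetical class "flowers_data"
    describe.flowers_data <- function(x) {
      paste("flowers data with", nrow(x), "rows")
    }

    # Fallback, called when no class-specific method matches
    describe.default <- function(x) {
      stop("no describe() method for class: ", paste(class(x), collapse = "/"))
    }

    # Work on copies so the built-in datasets keep their original class
    cars    <- datasets::mtcars
    flowers <- datasets::iris
    class(cars)    <- c("cars_data", class(cars))
    class(flowers) <- c("flowers_data", class(flowers))

    describe(cars)     # dispatches to describe.cars_data
    describe(flowers)  # dispatches to describe.flowers_data
    ```

    A default method like the one sketched here is optional, but it turns a missing class assignment into an explicit error message.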

    At the end you loop through your data. Give each data frame a name: it becomes the name of the corresponding item in the resulting list.
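
    The way names carry through lapply, and how later results can be added without <<-, can be sketched in base R. The function make_results is a hypothetical stand-in for createPlot that returns a dimension summary instead of plots, so the block runs without ggplot2.

    ```r
    # Hypothetical stand-in for createPlot(): returns a list of result records
    make_results <- function(data) {
      list(list(objType = "summary", object = dim(data)))
    }

    inputs <- list(mycars = datasets::mtcars, myflowers = datasets::iris)

    # lapply() keeps the names of its input list
    analysisObjects <- lapply(inputs, make_results)
    names(analysisObjects)

    # Reach a stored object by name and position
    analysisObjects$mycars[[1]]$object

    # Results produced later are added by ordinary assignment,
    # so no function has to modify a global variable
    more <- lapply(list(mytrees = datasets::trees), make_results)
    analysisObjects <- c(analysisObjects, more)
    names(analysisObjects)
    ```

    Because every function simply returns its result, the order in which the pieces are computed does not matter; only the final combination step decides what ends up in the list.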


    EDIT to answer your questions

    Duplicating a data frame in order to assign a different class costs very little extra memory.

    Look at this example. We use the function lobstr::obj_size to see how much memory the objects actually occupy.

    i2 <- i1 <- iris
    
    lobstr::obj_size(i1)
    #> 7,200 B
    lobstr::obj_size(i2)
    #> 7,200 B
    lobstr::obj_size(list(i1,i2))
    #> 7,264 B
    
    class(i1) <- c("someclass", class(i1))
    class(i2) <- c("someotherclass", class(i2))
    
    lobstr::obj_size(i1)
    #> 7,272 B
    lobstr::obj_size(i2)
    #> 7,272 B
    lobstr::obj_size(list(i1,i2))
    #> 7,728 B
    

    As you can see, R is optimized: when you "create" another object it points to the same memory rather than copying everything for no reason.

    If you edit a column of i2, only that one column is copied.

    So assigning different classes to "copies" of your dataset is not a problem.
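
    The column sharing can also be checked directly. This sketch assumes the lobstr package is installed and uses its obj_addrs function to compare the memory addresses of the columns; the variable names a and b are arbitrary.

    ```r
    library(lobstr)

    a <- datasets::iris
    b <- a
    class(b) <- c("someclass", class(b))

    # Compare column addresses after the class change only
    same_before <- obj_addrs(a) == obj_addrs(b)
    all(same_before)

    # Modify a single column of b
    b$Sepal.Length <- b$Sepal.Length * 2

    # Compare again, labelling each comparison with its column name
    same_after <- obj_addrs(a) == obj_addrs(b)
    names(same_after) <- names(a)
    same_after
    ```

    After the class change every column should still share its address with the original; after the edit only Sepal.Length should differ.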

    However, if for one specific data set you need to call two different methods you can do it this way:

    createPlot.someotherclassyet <- function(data){
    
      out1 <- createPlot.someclass(data)
      out2 <- createPlot.someotherclass(data)
    
      c(out1, out2)
    
    }
    
    class(iris) <- c("someotherclassyet", class(iris))
    createPlot(iris)
    

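    This gets the job done, but it's not really clean. Note also that createPlot.someclass maps the cyl column, which iris does not have. ggplot2 only evaluates aesthetics when a plot is drawn, so the call itself succeeds, but those two plots would fail when printed. Treat the example as an illustration of the dispatch pattern only.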

    It would be better to have functions that convert an object from one class to another, so that you have a single place to add checks or extra transformations if a class ever needs them.

    ### someclass
    as_someclass <- function(x){
    
      UseMethod("as_someclass")
    
    }
    
    as_someclass.someotherclassyet <- function(x){
      
      class(x) <- setdiff(class(x), "someotherclassyet")
      as_someclass(x)
    
    }
    
    as_someclass.data.frame <- function(x){
      
      class(x) <- c("someclass", class(x))
      x
    
    }
    
    
    
    ### someotherclass
    
    as_someotherclass <- function(x){
    
      UseMethod("as_someotherclass")
    
    }
    
    as_someotherclass.someotherclassyet <- function(x){
      
      class(x) <- setdiff(class(x), "someotherclassyet")
      as_someotherclass(x)
    
    }
    
    as_someotherclass.data.frame <- function(x){
      
      class(x) <- c("someotherclass", class(x))
      x
    
    }
    
    
    ### someotherclassyet
    as_someotherclassyet <- function(x){
    
      UseMethod("as_someotherclassyet")
    
    }
    
    as_someotherclassyet.data.frame <- function(x){
      
      class(x) <- c("someotherclassyet", class(x))
      x
    
    }
    
    
    createPlot.someotherclassyet <- function(data){
    
      out1 <- createPlot(as_someclass(data))
      out2 <- createPlot(as_someotherclass(data))
    
      c(out1, out2)
    
    }
    
    
    lapply(list(myflowers      = as_someclass(iris),
                myotherflowers = as_someotherclass(iris),
                allmyflowers   = as_someotherclassyet(iris),
                mycars         = as_someotherclass(mtcars)),
           createPlot)
    

    If you want to run createPlot on a list of objects, you can create your list and assign it a class of its own.
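
    One possible reading of that suggestion, as a sketch: give the list its own class and write a createPlot method that loops over its elements. The class name plotdata_list and the helper as_plotdata_list are invented for illustration. The two plotting methods are shortened stand-ins for the ones above (one plot each, and the box plot maps a y aesthetic) so that the block runs on its own; it assumes ggplot2 is installed.

    ```r
    library(ggplot2)

    createPlot <- function(data) {
      UseMethod("createPlot")
    }

    # Shortened stand-ins for the methods defined earlier
    createPlot.someclass <- function(data) {
      p <- ggplot(data, aes(x = factor(cyl), y = mpg)) + geom_boxplot()
      list(list(objType = "plot", object = p))
    }

    createPlot.someotherclass <- function(data) {
      p <- ggplot(data, aes(x = Sepal.Length, y = Sepal.Width)) + geom_point()
      list(list(objType = "plot", object = p))
    }

    # Hypothetical constructor: tags a plain list with its own class
    as_plotdata_list <- function(x) {
      stopifnot(is.list(x), !is.data.frame(x))
      class(x) <- c("plotdata_list", class(x))
      x
    }

    # List method: dispatch createPlot() on every element, keeping the names
    createPlot.plotdata_list <- function(data) {
      lapply(unclass(data), createPlot)
    }

    # Fresh copies of the datasets, each tagged with its class
    cars    <- datasets::mtcars
    flowers <- datasets::iris
    class(cars)    <- c("someclass", class(cars))
    class(flowers) <- c("someotherclass", class(flowers))

    analysisObjects <- createPlot(
      as_plotdata_list(list(mycars = cars, myflowers = flowers))
    )
    names(analysisObjects)
    ```

    With this method in place, a single createPlot call on the tagged list replaces the explicit lapply.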