
Subsetting a custom S4 class using the "subset" function within another function


I'm trying to define a subset method for my custom S4 class. Subsetting works as intended when I provide the subsetting criteria directly to subset, but the method fails whenever I call it from within another function that passes the criteria on to subset through its own arguments.

The S4 class myClass consists of a single data.frame:

# Define class
setClass("myClass", slots = c(data = "data.frame"))

# Initiate a myClass object
dat <- new("myClass", data = data.frame(Letter = c("A", "A", "B"), Number = c(1, 2, 3)))

To be able to subset the class based on the content of the data.frame in the slot data, I defined the following subset method:


setMethod("subset", signature(x = "myClass"), function(x, ...) {
  x@data <- subset(x@data, ...)
  return(x)
})

The method works as expected when called as follows:

# Assume we only want to retain entries containing the letter "A"
whichletter <- "A"

# Subset (does work)
subset(dat, Letter %in% whichletter)
An object of class "myClass"
Slot "data":
  Letter Number
1      A      1
2      A      2

However, when I try to run subset within another function, where the subset criteria are provided through that function's arguments, the subsetting fails:

# Random function that takes a letter `let` as argument
randomFunction <- function(object, let) {
  object_subsetted <- subset(object, Letter %in% let)
  return(object_subsetted)
}

# Subset (does not work)
randomFunction(object = dat, let = whichletter)
Error in Letter %in% let: object 'let' not found

This appears to be an issue with environments, but I can't figure out what exactly is going wrong. Does anyone have a suggestion for how to avoid this error?


Solution

  • I just found this and this, which together seem to answer my question. The issue is unrelated to the use of S4 classes; it is caused by how subset scopes variables. As far as I can tell, my original method handed the condition on to subset, which looked for let in x@data and then in the frame of the method itself, where let does not exist (whichletter only worked because it lives in the global environment). Explicitly evaluating the expression in the scope of x@data, with parent.frame() (the frame of the calling function, to which the variable let belongs) as the enclosing scope, solves the error and allows me to subset as desired.

    Here is the full code:

    # Define class
    setClass("myClass", slots = c(data = "data.frame"))
    
    # Initiate a myClass object
    dat <- new("myClass", data = data.frame(Letter = c("A", "A", "B"), Number = c(1, 2, 3)))
    
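    # Define a subset method
    setMethod("subset", signature(x = "myClass"), function(x, ...) {
      # Capture the unevaluated condition (assumes a single condition is passed via ...)
      condition <- substitute(...)
      # Look up names in the columns of x@data first, then in the caller's frame
      indices <- eval(condition, x@data, parent.frame())
      # drop = FALSE keeps a data.frame even if only one column remains
      x@data <- x@data[indices, , drop = FALSE]
      return(x)
    })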
    
    # Suppose we want to subset to only retain entries with "A"
    whichletter <- "A"
    
    # What if we have a function that should pass the subsetting value to the subset
    # function?
    randomFunction <- function(object, let) {
      object_subsetted <- subset(object, Letter %in% let)
      return(object_subsetted)
    }
    
    # Test it (works now)
    randomFunction(object = dat, let = "A")
    randomFunction(object = dat, let = "B")
    
    An object of class "myClass"
    Slot "data":
      Letter Number
    1      A      1
    2      A      2
    
    An object of class "myClass"
    Slot "data":
      Letter Number
    3      B      3
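
    The scoping rule behind this fix can be seen without any S4 machinery. The following is only a minimal sketch: filter_rows and wrapper are hypothetical helper names that are not part of the original code, and the NA handling is an extra I added for illustration.

    ```r
    # Toy data standing in for the x@data slot
    df <- data.frame(Letter = c("A", "A", "B"), Number = c(1, 2, 3))

    # Hypothetical helper: evaluate `cond` inside `data`, falling back to the caller
    filter_rows <- function(data, cond) {
      expr <- substitute(cond)                  # capture the unevaluated expression
      keep <- eval(expr, data, parent.frame())  # columns first, then the caller's frame
      data[keep & !is.na(keep), , drop = FALSE] # treat NA conditions as FALSE
    }

    # `let` exists only inside this function's frame
    wrapper <- function(data, let) {
      filter_rows(data, Letter %in% let)
    }

    res <- wrapper(df, "B")
    ```

    Here Letter is resolved from the columns of df, while let is resolved from the frame of wrapper because that frame is passed as the enclosure to eval.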
    

    However, as highlighted by @JDL, it's probably wiser to define a [ method instead.
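
    One possible reading of that suggestion is sketched below. This is my own assumption about what such a [ method could look like, not code from @JDL; keep_letter is a hypothetical wrapper, and the method deliberately ignores drop so that the slot always holds a data.frame.

    ```r
    library(methods)

    setClass("myClass", slots = c(data = "data.frame"))

    # Delegate row/column indexing to the data.frame stored in the slot
    setMethod("[", signature(x = "myClass"), function(x, i, j, ..., drop = TRUE) {
      rows <- if (missing(i)) TRUE else i   # no row index: keep all rows
      cols <- if (missing(j)) TRUE else j   # no column index: keep all columns
      # `drop` is ignored on purpose: the slot must remain a data.frame
      x@data <- x@data[rows, cols, drop = FALSE]
      x
    })

    dat <- new("myClass", data = data.frame(Letter = c("A", "A", "B"), Number = c(1, 2, 3)))

    # Hypothetical wrapper: the condition is evaluated with ordinary scoping rules
    keep_letter <- function(object, let) {
      object[object@data$Letter %in% let, ]
    }

    res <- keep_letter(dat, "A")
    ```

    Because the index is an ordinary value that is computed before the method is called, no non-standard evaluation is involved, so the method behaves the same at top level and inside other functions.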