Are `$` and `[[` equivalent when accessing elements of an S4 object?


I've always thought the $ and [[ accessors were essentially the same, based on reading various tutorials and posts (e.g., https://www.r-bloggers.com/2009/10/r-accessors-explained/). However, I am digging into an S4 object for the first time and got the surprising result below: $ works, but [[ does not.

> gse$Description %>% head(2)
[1] "cardiac muscle tissue development"  "striated muscle tissue development"
> gse[["Description"]] %>% head(2)
Error in `[[.gseaResult`(gse, "Description") : input term not found...
> gse %>% slot("Description") %>% head(2)
Error in slot(., "Description") : 
  no slot of name "Description" for this object of class "gseaResult"
> gse@result$Description %>% head(2)
[1] "cardiac muscle tissue development"  "striated muscle tissue development"
> gse@result[["Description"]] %>% head(2)
[1] "cardiac muscle tissue development"  "striated muscle tissue development"           

For reproducibility, here is the gse object:

> gse %>% dput()
new("gseaResult", result = structure(list(ID = c("GO:0048738", 
"GO:0014706", "GO:0055001", "GO:0051146", "GO:0140694"), Description = c("cardiac muscle tissue development", 
"striated muscle tissue development", "muscle cell development", 
"striated muscle cell differentiation", "non-membrane-bounded organelle assembly"
), setSize = c(280L, 302L, 230L, 341L, 391L), enrichmentScore = c(-0.583663510336891, 
-0.543426982282193, -0.559387452878044, -0.523546550894944, -0.475084590959071
), NES = c(-2.10733279137958, -1.97184487532844, -1.97137557942708, 
-1.91649873102392, -1.74800071168715), pvalue = c(1e-10, 1e-10, 
1e-10, 1e-10, 1e-10), p.adjust = c(8.2175e-08, 8.2175e-08, 8.2175e-08, 
8.2175e-08, 8.2175e-08), qvalue = c(7.73421052631579e-08, 7.73421052631579e-08, 
7.73421052631579e-08, 7.73421052631579e-08, 7.73421052631579e-08
), rank = c(1374, 1374, 1340, 1340, 3310), leading_edge = c("tags=25%, list=7%, signal=24%", 
"tags=24%, list=7%, signal=22%", "tags=23%, list=7%, signal=22%", 
"tags=21%, list=7%, signal=20%", "tags=22%, list=16%, signal=19%"
)), row.names = c("GO:0048738", "GO:0014706", "GO:0055001", "GO:0051146", 
"GO:0140694"), class = "data.frame"), organism = "Mus musculus", 
    setType = "BP", geneSets = list(), geneList = numeric(0), 
    keytype = "ALIAS", permScores = structure(numeric(0), dim = c(0L, 
    0L)), params = list(pvalueCutoff = 0.05, eps = 1e-10, pAdjustMethod = "BH", 
        exponent = 1, minGSSize = 10, maxGSSize = 500), gene2Symbol = character(0), 
    readable = FALSE, termsim = structure(numeric(0), dim = c(0L, 
    0L)), method = character(0), dr = list())
> gse %>% str()
Formal class 'gseaResult' [package "DOSE"] with 13 slots
  ..@ result     :'data.frame': 5 obs. of  10 variables:
  .. ..$ ID             : chr [1:5] "GO:0048738" "GO:0014706" "GO:0055001" "GO:0051146" ...
  .. ..$ Description    : chr [1:5] "cardiac muscle tissue development" "striated muscle tissue development" "muscle cell development" "striated muscle cell differentiation" ...
  .. ..$ setSize        : int [1:5] 280 302 230 341 391
  .. ..$ enrichmentScore: num [1:5] -0.584 -0.543 -0.559 -0.524 -0.475
  .. ..$ NES            : num [1:5] -2.11 -1.97 -1.97 -1.92 -1.75
  .. ..$ pvalue         : num [1:5] 1e-10 1e-10 1e-10 1e-10 1e-10
  .. ..$ p.adjust       : num [1:5] 8.22e-08 8.22e-08 8.22e-08 8.22e-08 8.22e-08
  .. ..$ qvalue         : num [1:5] 7.73e-08 7.73e-08 7.73e-08 7.73e-08 7.73e-08
  .. ..$ rank           : num [1:5] 1374 1374 1340 1340 3310
  .. ..$ leading_edge   : chr [1:5] "tags=25%, list=7%, signal=24%" "tags=24%, list=7%, signal=22%" "tags=23%, list=7%, signal=22%" "tags=21%, list=7%, signal=20%" ...
  ..@ organism   : chr "Mus musculus"
  ..@ setType    : chr "BP"
  ..@ geneSets   : list()
  ..@ geneList   : num(0) 
  ..@ keytype    : chr "ALIAS"
  ..@ permScores : num[0 , 0 ] 
  ..@ params     :List of 6
  .. ..$ pvalueCutoff : num 0.05
  .. ..$ eps          : num 1e-10
  .. ..$ pAdjustMethod: chr "BH"
  .. ..$ exponent     : num 1
  .. ..$ minGSSize    : num 10
  .. ..$ maxGSSize    : num 500
  ..@ gene2Symbol: chr(0) 
  ..@ readable   : logi FALSE
  ..@ termsim    : num[0 , 0 ] 
  ..@ method     : chr(0) 
  ..@ dr         : list()

Is the failure of gse[["Description"]] expected? I am surprised because I had the impression $ "is handiest when doing interactive programming but should be discouraged for more production oriented code because of its limitations, namely the inability to interpolate the names or use integer indices." Does this principle not hold for S4 objects?

Relatedly, this tutorial (http://adv-r.had.co.nz/S4.html) says, "To access slots of an S4 object you use @, not $. Or if you have a character string giving a slot name, you use the slot function. This is the equivalent of [[." Is there a way to tell when $ will work (my example) vs. when @ is used?

Thanks!


Solution

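  • $ and [[ are not built-in accessors for S4 objects. Each one works only if a method has been defined for the class, and the two methods need not do the same thing.

    • @ (or slot() with a character string) is the direct way to read a slot, and it involves no method dispatch. gse@result works because result is a slot. slot(gse, "Description") fails because Description is a column of the data frame stored in the result slot, not a slot itself.

    • gse$Description works because a $ method is defined for gseaResult. Judging from your output, that method forwards the name to the data frame in the result slot, which is why gse$Description returns the same values as gse@result$Description.

    • gse[["Description"]] fails inside [[.gseaResult, as the error message shows. So a [[ method does exist, but it treats its argument as a term to look up rather than as a column name, and "Description" is not a term. It apparently expects something like the values in the ID column.

    The equivalence of $ and [[ that the tutorials describe holds for lists and data frames, which is why gse@result$Description and gse@result[["Description"]] agree. It does not carry over to S4 classes, where the class author decides what each operator means.

    To tell what will work for a given object, slotNames(gse) or str(gse) shows what you can reach with @, and methods(class = "gseaResult") lists the methods available for the class, including any for $ and [[.

    As a rule of thumb:

    • Prefer the accessor functions or methods that the package documents; fall back to @ when none exists, keeping in mind that slots are internal and may change between versions
    • Once you have extracted an ordinary list or data frame (e.g. gse@result), $ and [[ behave the way the tutorials describe
    • Define $ and [[ methods explicitly on your own S4 classes if you want that kind of access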
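
To make the dispatch behaviour concrete, here is a minimal sketch using a hypothetical toy class. ToyResult and its two methods are made up for illustration and are not DOSE's actual implementation. The sketch shows that @ reads slots without any method, that $ fails on an S4 object until a method is defined, and that a [[ method can be written to mean something different from $.

```r
library(methods)

# Hypothetical stand-in for gseaResult: a data-frame slot plus one other slot.
setClass("ToyResult", slots = c(result = "data.frame", organism = "character"))

toy <- new("ToyResult",
           result = data.frame(ID = c("T1", "T2"),
                               Description = c("first term", "second term"),
                               stringsAsFactors = FALSE),
           organism = "Mus musculus")

# `@` and slotNames() work on any S4 object; no method is required.
slot_names <- slotNames(toy)
organism   <- toy@organism

# With no `$` method defined yet, `$` on the S4 object is an error.
dollar_failed <- inherits(try(toy$Description, silent = TRUE), "try-error")

# A `$` method that forwards the name to the `result` data frame.
setMethod("$", "ToyResult", function(x, name) x@result[[name]])
descriptions <- toy$Description

# A `[[` method with a different meaning: look up a row by term ID.
setMethod("[[", "ToyResult", function(x, i, j, ...) {
  if (!i %in% x@result$ID) stop("input term not found...")
  x@result[x@result$ID == i, , drop = FALSE]
})
one_term  <- toy[["T1"]]
bad_input <- tryCatch(toy[["Description"]], error = function(e) conditionMessage(e))
```

In this sketch, toy$Description returns the column while toy[["Description"]] raises an error, because the two operators were deliberately given different meanings. That is the same pattern as in the question: neither operator touches slots directly, and what each one does is entirely up to the methods defined for the class.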