
A comprehensive survey of the types of things in R; 'mode' and 'class' and 'typeof' are insufficient


The language R confuses me. Entities have modes and classes, but even this is insufficient to fully describe the entity.

This answer says

In R every 'object' has a mode and a class.

So I did these experiments:

> class(3)
[1] "numeric"
> mode(3)
[1] "numeric"
> typeof(3)
[1] "double"

Fair enough so far, but then I passed in a vector instead:

> mode(c(1,2))
[1] "numeric"
> class(c(1,2))
[1] "numeric"
> typeof(c(1,2))
[1] "double"

That doesn't make sense to me. Surely a vector of numbers should have a different class, or a different mode, than a single number? My questions are:

  • Does everything in R have (exactly one) class?
  • Does everything in R have (exactly one) mode?
  • What, if anything, does 'typeof' tell us?
  • What other information is needed to fully describe an entity? (Where is the 'vectorness' stored, for example?)

Update: Apparently, a literal 3 is just a vector of length 1; there are no scalars. OK, but... I tried mode("string") and got "character", which led me to think that a string is a vector of characters. If that were true, then c('h','i') == "hi" should be TRUE, but it's not!


Solution

  • I agree that the type system in R is rather weird. It is that way because it has evolved over a long time...

    Note that you missed one more type-like function, storage.mode, and one more class-like function, oldClass.

    So, mode and storage.mode are the old-style types (where storage.mode is more accurate), and typeof is the newer, even more accurate version.

    mode(3L)                  # numeric
    storage.mode(3L)          # integer
    storage.mode(`identical`) # function
    storage.mode(`if`)        # function
    typeof(`identical`)       # closure
    typeof(`if`)              # special
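
    As a rough side-by-side view of the three type-like functions, the sketch below tabulates them for a handful of objects. The list `objs`, its labels, and the name `type_table` are made up for this illustration.

```r
# A few sample objects; the names in this list are only row labels.
objs <- list(
  double    = 3,
  integer   = 3L,
  character = "hi",
  list      = list(1),
  closure   = identical,
  special   = `if`,
  builtin   = sum
)

# One row per object, one column per type-like function.
type_table <- data.frame(
  mode         = vapply(objs, mode, character(1)),
  storage.mode = vapply(objs, storage.mode, character(1)),
  typeof       = vapply(objs, typeof, character(1)),
  stringsAsFactors = FALSE
)
print(type_table)
```

    Reading across a row shows how each function is a bit more specific than the one before it: all three kinds of function share mode "function", and only typeof tells them apart.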
    

    Then class is a whole different story. class is mostly just the class attribute of an object (that's exactly what oldClass returns). But when the class attribute is not set, the class function makes up a class from the object type and the dim attribute.

    oldClass(3L) # NULL
    class(3L) # integer
    class(structure(3L, dim=1)) # array
    class(structure(3L, dim=c(1,1))) # matrix (since R 4.0.0: "matrix" "array")
    class(list()) # list
    class(structure(list(1), dim=1)) # array
    class(structure(list(1), dim=c(1,1))) # matrix (since R 4.0.0: "matrix" "array")
    class(structure(list(1), dim=1, class='foo')) # foo
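
    To make the split between the stored class attribute and the made-up class a bit more concrete, here is a small sketch. The variable names `x`, `y` and `m` are only for illustration.

```r
x <- structure(list(1), dim = 1, class = "foo")

oldClass(x)   # the class attribute itself: "foo"
class(x)      # the same, because the attribute is set
typeof(x)     # the underlying type is unaffected: "list"

y <- unclass(x)   # drop the class attribute, keep dim
oldClass(y)       # NULL: nothing is stored any more
class(y)          # made up from the type and dim: "array"

m <- matrix(1:4, nrow = 2)
is.null(attr(m, "class"))   # TRUE: no class attribute is stored
inherits(m, "matrix")       # TRUE: the made-up class still counts
```

    Setting or removing the class attribute never changes what typeof reports; it only changes what class and S3 dispatch see.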
    

    Finally, class can return more than one string. That normally happens only when the class attribute itself holds several strings: the first is then the main class, and the following ones are what it inherits from. The made-up classes have length 1, with one exception: since R 4.0.0 a matrix is reported as "matrix" "array".

    # Here "A" inherits from "B", which inherits from "C"
    class(structure(1, class=LETTERS[1:3])) # "A" "B" "C"
    
    # an ordered factor:
    class(ordered(3:1)) # "ordered" "factor"
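
    The order of the strings matters because S3 method dispatch walks the class vector from left to right. A minimal sketch, assuming a hypothetical generic `describe` and methods that exist only for this illustration:

```r
# Hypothetical generic and methods, defined only for this example.
describe <- function(x, ...) UseMethod("describe")
describe.B <- function(x, ...) "method for B"
describe.default <- function(x, ...) "default method"

obj <- structure(1, class = c("A", "B", "C"))

describe(obj)                     # no describe.A exists, so describe.B is used
inherits(obj, "C")                # TRUE: "C" is somewhere in the class vector
inherits(obj, "C", which = TRUE)  # 3: the position of "C" in the class vector
describe(1)                       # no class attribute and no matching method: default
```

    With no method for "A", dispatch moves on to "B" and stops there; "C" is never tried because an earlier class already matched.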