R: rownames, colnames, dimnames and names in apply


I would like to use apply to run across the rows of a matrix, and I would like to use the rowname of the current row in my function. It seems you can't use rownames, colnames, dimnames or names directly inside the function. I am aware that I can probably create a workaround based on information in this question.

But my question is: how does apply handle the row and column names of the array in its first argument, and the assignment of names to objects created inside the function called by apply? It seems a bit inconsistent, as I hope to show with the following example. Is there a reason why it was designed like this?

# Toy data
m <- matrix( runif(9) , nrow = 3 )
rownames(m) <- LETTERS[1:3]
colnames(m) <- letters[1:3]
m
          a         b           c
A 0.5092062 0.3786139 0.120436569
B 0.7563015 0.7127949 0.003358308
C 0.8794197 0.3059068 0.985197273

# These return NULL
apply( m , 1 , FUN = function(x){ rownames(x) } )
NULL
apply( m , 1 , FUN = function(x){ colnames(x) } )
NULL
apply( m , 1 , FUN = function(x){ dimnames(x) } )
NULL

# But...
apply( m , 1 , FUN = function(x){ names(x) } )
     A   B   C  
[1,] "a" "a" "a"
[2,] "b" "b" "b"
[3,] "c" "c" "c"
# This looks like a column-wise matrix of colnames, with the rownames of m as the column names to me

# And further you can get...
n <- apply( m , 1 , FUN = function(x){ names(x) } )
dimnames(n)
[[1]]
NULL

[[2]]
[1] "A" "B" "C"

# But you can't do...
apply( m , 1 , FUN = function(x){ n <- names(x); dimnames(n) } )
NULL

I just want to understand what happens internally in apply. Many thanks.


Solution

  • I think your confusion stems from the fact that apply does not pass an array (or matrix) to the function specified in FUN.

    It passes each row of the matrix in turn. Each row is itself "only" a (named) vector:

    > m[1,]
             a          b          c 
    0.48768161 0.61447934 0.08718875 
    

    So your function has only this named vector to work with.

    For your middle example, as documented in the Value section of ?apply:

    If each call to FUN returns a vector of length n, then apply returns an array of dimension c(n, dim(X)[MARGIN]) if n > 1. If n equals 1, apply returns a vector if MARGIN has length 1 and an array of dimension dim(X)[MARGIN] otherwise.

    Here function(x) names(x) returns a character vector of length 3 for each row, so the final result is the 3 x 3 matrix you see. That matrix is only assembled at the end of apply, from the results of calling FUN on each row individually. Inside FUN no such matrix exists yet, which is why dimnames(n) is NULL in your last example.
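
To make the points above concrete, here is a minimal sketch. It uses a fixed integer matrix instead of runif so the results are reproducible. The final sapply call is only one possible workaround for getting at the row name (it is not part of apply itself), and the names seen, n and res are purely illustrative.

```r
# Toy matrix with fixed values and the same dimnames as in the question
m <- matrix(1:9, nrow = 3, dimnames = list(LETTERS[1:3], letters[1:3]))

# Inside FUN, x is a plain named vector: it has no dim attribute,
# so rownames(), colnames() and dimnames() all return NULL
seen <- apply(m, 1, function(x) is.null(dim(x)))
all(seen)      # TRUE

# apply() binds the per-row results together column-wise afterwards,
# labelling the columns with the row names of m
n <- apply(m, 1, function(x) names(x))
dim(n)         # 3 3
colnames(n)    # "A" "B" "C"

# Possible workaround (an assumption, not from the answer above):
# iterate over the row names and index m inside the function
res <- sapply(rownames(m), function(r) paste(r, sum(m[r, ])))
res            # named character vector: A = "A 12", B = "B 15", C = "C 18"
```

Because the workaround loops over rownames(m) rather than over the rows themselves, the current row name r is available directly inside the function, and m[r, ] recovers the same named vector that apply would have passed.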