
Multiple functions in a single tapply or aggregate statement


Is it possible to include two functions within a single tapply or aggregate statement?

Below I use two tapply statements and two aggregate statements: one for mean and one for SD.
I would prefer to combine the statements.

my.Data = read.table(text = "
  animal    age     sex  weight
       1  adult  female     100
       2  young    male      75
       3  adult    male      90
       4  adult  female      95
       5  young  female      80
", sep = "", header = TRUE)

with(my.Data, tapply(weight, list(age, sex), function(x) {mean(x)}))
with(my.Data, tapply(weight, list(age, sex), function(x) {sd(x)  }))

with(my.Data, aggregate(weight ~ age + sex, FUN = mean))
with(my.Data, aggregate(weight ~ age + sex, FUN =   sd))

# this does not work:

with(my.Data, tapply(weight, list(age, sex), function(x) {mean(x) ; sd(x)}))

# I would also prefer that the output be formatted similarly to what is
# shown below.  `aggregate` formats the output perfectly.  I just cannot figure
# out how to implement two functions in one statement.

  age    sex   mean        sd
adult female   97.5  3.535534
adult   male     90        NA
young female   80.0        NA
young   male     75        NA

I can always run two separate statements and merge the output. I was just hoping there might be a slightly more convenient solution.

I found the answer below posted here: Apply multiple functions to column using tapply

f <- function(x) c(mean(x), sd(x))
do.call( rbind, with(my.Data, tapply(weight, list(age, sex), f)) )

However, neither the rows nor the columns are labeled.

     [,1]     [,2]
[1,] 97.5 3.535534
[2,] 80.0       NA
[3,] 90.0       NA
[4,] 75.0       NA

I would prefer a solution in base R. A solution from the plyr package was posted at the link above. If I can add the correct row and column headings to the above output, it would be perfect.


Solution

  • These should work:

    with(my.Data, aggregate(weight, list(age, sex), function(x) { c(MEAN=mean(x), SD=sd(x) )}))
    
    with(my.Data, tapply(weight, list(age, sex), function(x) { c(mean(x) , sd(x) )} ))
    # Not a nice structure but the results are in there
    
    with(my.Data, aggregate(weight ~ age + sex, FUN =  function(x) c( SD = sd(x), MN= mean(x) ) ) )
        age    sex weight.SD weight.MN
    1 adult female  3.535534 97.500000
    2 young female        NA 80.000000
    3 adult   male        NA 90.000000
    4 young   male        NA 75.000000
    

    The principle is that your function must return "one thing", which can be either a vector or a list. It cannot be two successive function calls such as {mean(x) ; sd(x)}, because a function only returns the value of its last expression.
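
As a minimal sketch of that principle in base R: when FUN returns a named vector, `aggregate` keeps both statistics in a single matrix column, and `do.call(data.frame, ...)` can flatten it into ordinary columns. The same named vector also gives the `tapply` + `rbind` result its column labels; the row labels are built by hand here, which assumes every age/sex combination actually occurs in the data (an empty cell would be dropped by `rbind` and the labels would no longer line up). The names `res`, `flat`, `tab` and `labeled` are only illustrative.

```r
my.Data <- read.table(text = "
  animal    age     sex  weight
       1  adult  female     100
       2  young    male      75
       3  adult    male      90
       4  adult  female      95
       5  young  female      80
", header = TRUE)

# One function, one return value: a named vector holding both statistics.
f <- function(x) c(mean = mean(x), sd = sd(x))

# aggregate() stores the two statistics in one matrix column called `weight`.
res <- aggregate(weight ~ age + sex, data = my.Data, FUN = f)

# Flatten the matrix column into separate columns of a plain data frame.
flat <- do.call(data.frame, res)
names(flat)
# "age" "sex" "weight.mean" "weight.sd"

# tapply() returns a 2 x 2 list-matrix; rbind stacks its cells column by column.
tab <- with(my.Data, tapply(weight, list(age, sex), f))
labeled <- do.call(rbind, tab)

# Row labels in the same column-major order (assumes no empty age/sex cell).
rownames(labeled) <- as.vector(outer(rownames(tab), colnames(tab),
                                     paste, sep = "."))
labeled
```

The column names of `labeled` come from the names inside `f`, so renaming the statistics only requires changing that one function.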