Table with function in R


I need to compute some descriptive statistics on a dataset. Specifically, I need a table that gives, for each level of a factor, the mean of another variable:

city   mean(age) 
 1       14    
 2       15    
 3       23    
 4       34    

What is the fastest way to do this in R?

I also need the same summary across two dimensions:

mean(age)   male   female
city
1           12     13
2           15     16
3           21     22
4           34     33

I would also like to know whether other functions, such as max, min, or sum, can be applied in the same way.

Edit: here is a sample dataset to make examples easier to write:

dato <- data.frame(years=rep(c(12,13,14,15,15,16,34,67,45,78,17,42),2),sex=rep(c("M","F"),12),city=rep(c(1,2,3,4,4,3,2,1),3))

Solution

  • You could try dcast from the data.table package, which is faster than the reshape2 version on big data sets

    library(data.table)
    # data.table ships its own dcast method, so reshape2 is not required
    dcast(setDT(dato), city ~ sex, value.var = "years", fun.aggregate = mean)
    
    #    city        F        M
    # 1:    1 41.33333 24.00000
    # 2:    2 35.66667 21.66667
    # 3:    3 35.66667 21.66667
    # 4:    4 41.33333 24.00000
    

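    You could also use plain data.table grouping, which returns the result in long format

    # keyby groups and sorts by city and sex; the result is printed rather than
    # assigned back to dato, so dato keeps its original columns
    setDT(dato)[, .(mean = mean(years)), keyby = .(city, sex)]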
    
    #    city sex     mean
    # 1:    1   F 41.33333
    # 2:    1   M 24.00000
    # 3:    2   F 35.66667
    # 4:    2   M 21.66667
    # 5:    3   F 35.66667
    # 6:    3   M 21.66667
    # 7:    4   F 41.33333
    # 8:    4   M 24.00000
    

    Or use the dplyr package (also very fast)

    library(dplyr)
    dato %>%
      group_by(city, sex) %>%
      summarize(mean(years))
    
    #   city sex mean(years)
    # 1    1   F    41.33333
    # 2    1   M    24.00000
    # 3    2   F    35.66667
    # 4    2   M    21.66667
    # 5    3   F    35.66667
    # 6    3   M    21.66667
    # 7    4   F    41.33333
    # 8    4   M    24.00000
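
The approaches above rely on add-on packages and show only the city by sex case. As a rough sketch in base R (not part of the original answer), tapply and aggregate can cover the one-factor table, the two-factor table, and other summary functions such as max, min, or sum. The object names mean_by_city, mean_by_city_sex, max_by_city_sex, and stats are illustrative only.

```r
# Sample data from the question, stored under the name `dato` used above.
dato <- data.frame(
  years = rep(c(12, 13, 14, 15, 15, 16, 34, 67, 45, 78, 17, 42), 2),
  sex   = rep(c("M", "F"), 12),
  city  = rep(c(1, 2, 3, 4, 4, 3, 2, 1), 3)
)

# One factor: mean of years for each city.
mean_by_city <- tapply(dato$years, dato$city, mean)

# Two factors: matrix of means with one row per city and one column per sex.
mean_by_city_sex <- with(dato, tapply(years, list(city = city, sex = sex), mean))

# Any other summary function can be swapped in the same way.
max_by_city_sex <- with(dato, tapply(years, list(city = city, sex = sex), max))

# Several statistics at once, in long format; the `years` column of the
# result is a matrix with one column per statistic.
stats <- aggregate(
  years ~ city + sex, data = dato,
  FUN = function(x) c(mean = mean(x), min = min(x), max = max(x), sum = sum(x))
)
```

mean_by_city_sex has the two-dimensional layout asked for in the question, while stats keeps one row per city and sex combination, which tends to be easier to filter or merge later.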