
Summarize the same variables from multiple dataframes in one table


I have voter and party data from several datasets, which I split into separate data frames and vectors to make them comparable. I could run summary() on each of them individually and compare the results manually, but is there a way to get all the summaries together in one table?

Here's a sample of what I have:

> summary(eco$rilenew)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
      3       4       4       4       4       5 
> summary(ecovoters)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
  0.000   3.000   4.000   3.744   5.000  10.000      26 
> summary(lef$rilenew)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  2.000   3.000   3.000   3.692   4.000   7.000 
> summary(lefvoters)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
  0.000   2.000   3.000   3.612   5.000  10.000     332
> summary(soc$rilenew)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  2.000   4.000   4.000   4.143   5.000   6.000 
> summary(socvoters)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
  0.000   3.000   4.000   3.674   5.000  10.000     346 

Is there a way to summarize these vectors (ecovoters, lefvoters, socvoters, etc.) and the data frame variables (eco$rilenew, lef$rilenew, soc$rilenew, etc.) together in one table?


Solution

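  • You could put everything into a list and summarize it with a small custom function. summary() only reports an NA's entry when the vector contains missing values, so the function pads every result to seven entries, which lets the rows line up in one matrix.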

    L <- list(eco$rilenew, ecovoters, lef$rilenew, 
              lefvoters, soc$rilenew, socvoters)
    
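    t(sapply(L, function(x) {
      s <- summary(x)
      # summary() omits the NA's entry when x has no NAs, so pad to 7 entries
      length(s) <- 7
      names(s)[7] <- "NA's"
      # the padded entry is NA; replace it with an explicit count of 0
      s[7] <- if (anyNA(x)) s[7] else 0
      s
    }))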
               Min.   1st Qu.   Median     Mean  3rd Qu.      Max. NA's
    [1,]  0.9820673 3.3320662 3.958665 3.949512 4.625109  7.229069    0
    [2,] -4.8259384 0.5028293 3.220546 3.301452 6.229384  9.585749   26
    [3,] -0.3717391 2.3280366 3.009360 3.013908 3.702156  6.584659    0
    [4,] -2.6569493 1.6674330 3.069440 3.015325 4.281100  8.808432  332
    [5,] -2.3625651 2.4964361 3.886673 3.912009 5.327401 10.349040    0
    [6,] -2.4719404 1.3635785 2.790523 2.854812 4.154936  8.491347  346
    

    Data

    set.seed(42)
    eco <- data.frame(rilenew=rnorm(800, 4, 1))
    ecovoters <- rnorm(75, 4, 4)
    ecovoters[sample(length(ecovoters), 26)] <- NA
    lef <- data.frame(rilenew=rnorm(900, 3, 1))
    lefvoters <- rnorm(700, 3, 2)
    lefvoters[sample(length(lefvoters), 332)] <- NA
    soc <- data.frame(rilenew=rnorm(900, 4, 2))
    socvoters <- rnorm(700, 3, 2)
    socvoters[sample(length(socvoters), 346)] <- NA
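
The matrix above labels its rows only as [1,] to [6,], so it can be hard to tell which summary belongs to which object. A minimal sketch of one way around that, assuming a named list is acceptable: sapply() carries the list names through, and they become row names after t(). The helper summarize_vec and the toy vectors below are hypothetical and not part of the original answer; the helper computes the statistics directly with quantile() and mean() instead of padding the summary() output, so the values are not rounded the way printed summary() output is.

```r
# Hypothetical helper: six summary statistics plus an explicit NA count.
summarize_vec <- function(x) {
  # names = FALSE keeps the quantiles unnamed so the labels below are used as-is
  q <- quantile(x, probs = c(0, 0.25, 0.5, 0.75, 1), na.rm = TRUE, names = FALSE)
  c("Min." = q[1], "1st Qu." = q[2], "Median" = q[3],
    "Mean" = mean(x, na.rm = TRUE),
    "3rd Qu." = q[4], "Max." = q[5],
    "NA's" = sum(is.na(x)))
}

# Made-up toy data standing in for a data frame column and a voter vector
party_scores <- c(3, 4, 4, 4, 5)
voter_scores <- c(0, 3, 4, NA, 10)

# Naming the list elements gives the table readable row labels
L <- list(party = party_scores, voters = voter_scores)
tab <- t(sapply(L, summarize_vec))
print(tab)
```

If a data frame is more convenient than a matrix (for example to pass it to a table-formatting package), as.data.frame(tab) keeps the same row and column labels.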