
Compare summary statistics for two data frames with the same columns


I have two data frames, A and B, with the same columns apart from the primary key (the real data has more than 50 such columns). I want to compare the summary statistics (the output of R's normal summary() command) for both data frames, with the results for each column shown next to each other, as in the attached image.

DATAFRAME DPUT STRUCTURE

A <- structure(list(Pkey = c(1, 2, 3, 4, 5), Phy_marks = c(43, 44, 45,
    46, 47), Math_marks = c(34, 34, 45, 32, 21)), .Names = c("Pkey",
    "Phy_marks", "Math_marks"), row.names = c(NA, -5L), class =
    "data.frame")

B <- structure(list(Pkey = c(11, 12, 13, 14, 15), Phy_marks = c(43, 44,
    45, 46, 47), Math_marks = c(34, 34, 45, 32, 21)), .Names = c("Pkey",
    "Phy_marks", "Math_marks"), row.names = c(NA, -5L), class =
    "data.frame")

PLEASE HELP!!!


Solution

  • You can use the function below to compare the two data sets.

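    library(dplyr)  # used here only for the %>% pipe

    compare_them <- function(data1, data2) {
      # Column-wise summary() of each data set. apply() coerces the data
      # frame to a matrix, so this assumes all columns are numeric.
      sum1 <- apply(data1, 2, summary) %>% data.frame()
      sum2 <- apply(data2, 2, summary) %>% data.frame()

      # Tag each column with its source: "1" for data1, "2" for data2
      names(sum1) <- paste0(names(sum1), "1")
      names(sum2) <- paste0(names(sum2), "2")

      final <- cbind(sum1, sum2)

      # Sort the columns by name so that each variable's two summaries
      # end up next to each other
      final1 <- t(final)
      final2 <- final1[order(row.names(final1)), ]

      final_1 <- t(final2) %>% data.frame()
      final_1
    }

    # View() opens the result in a viewer window (interactive sessions only)
    compare_them(mtcars, mtcars * 2) %>% View()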
    

    Column names from data1 end in "1" and column names from data2 end in "2". I used mtcars and mtcars*2 as an example. Because the columns are sorted by name, the two summaries of each variable appear in adjacent columns of the result.
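
    Applied to the two data frames from the question, the same idea can be sketched in base R without dplyr. This is only an illustration under some assumptions: the names A, B and compare_summaries are chosen here and do not come from the answer, the key column is dropped by name, and every remaining column is assumed to be numeric with no missing values (otherwise summary() returns results of different lengths and sapply() no longer yields a matrix).

```r
# Data frames from the question; the names A and B are assumed for illustration.
A <- data.frame(Pkey = c(1, 2, 3, 4, 5),
                Phy_marks = c(43, 44, 45, 46, 47),
                Math_marks = c(34, 34, 45, 32, 21))
B <- data.frame(Pkey = c(11, 12, 13, 14, 15),
                Phy_marks = c(43, 44, 45, 46, 47),
                Math_marks = c(34, 34, 45, 32, 21))

# Hypothetical helper: summarise the shared non-key columns of two data
# frames and interleave the results so each variable's summaries are adjacent.
compare_summaries <- function(data1, data2, key = "Pkey") {
  # Columns present in both data frames, excluding the key column
  cols <- setdiff(intersect(names(data1), names(data2)), key)

  # sapply() goes column by column, so the data frame is not coerced
  # to a matrix first; each result is a statistics-by-column matrix
  sum1 <- sapply(data1[cols], summary)
  sum2 <- sapply(data2[cols], summary)
  colnames(sum1) <- paste0(cols, "_1")
  colnames(sum2) <- paste0(cols, "_2")

  # Interleave column positions: var1_1, var1_2, var2_1, var2_2, ...
  idx <- as.vector(rbind(seq_along(cols), seq_along(cols) + length(cols)))
  cbind(sum1, sum2)[, idx, drop = FALSE]
}

result <- compare_summaries(A, B)
print(result)
```

    Interleaving by position keeps the original column order instead of sorting alphabetically, and the "_1"/"_2" suffixes avoid ambiguity when a column name already ends in a digit.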