S3: is there a way to combine prop.table for character variables?

Noob here. I'm stuck trying to use S3 to summarise proportion data for a data.frame with four columns of character data. My goal is to build a summary method that shows the proportions for every level of every variable at once.

I can see how to get the proportion for each column:

a50survey1 <- prop.table(table(Student1995$alcohol))
a50survey2 <- prop.table(table(Student1995$drugs))
a50survey3 <- prop.table(table(Student1995$smoke))
a50survey4 <- prop.table(table(Student1995$sport))

                  Not  Once or Twice a week          Once a month           Once a week More than once a week 
                 0.10                  0.32                  0.24                  0.28                  0.06 

But I cannot find a way to combine all of the prop.table outputs into one summary output. Unless I'm badly mistaken, there is no S3 method like summary.prop.table that would work for me. The goal is to set this up for the current data frame and then drop in new data frames with the same size and the same kind of observations in the future.

I'm really a step-by-step guy, so if you can help me, that would be great. Thank you.

Data frame info here. There are four columns, and each column has a different number of categorical options for its observations.

> dput(head(Student1995,5))
structure(list(alcohol = structure(c(3L, 2L, 2L, 2L, 3L), .Label = c("Not", 
"Once or Twice a week", "Once a month", "Once a week", "More than once a week"
), class = "factor"), drugs = structure(c(1L, 2L, 1L, 1L, 1L), .Label = c("Not", 
"Tried once", "Occasional", "Regular"), class = "factor"), smoke = structure(c(2L, 
3L, 1L, 1L, 1L), .Label = c("Not", "Occasional", "Regular"), class = "factor"), 
    sport = structure(c(2L, 1L, 1L, 2L, 2L), .Label = c("Not regular", 
    "Regular"), class = "factor")), row.names = c(NA, 5L), class = "data.frame")

Edit: the summary() output, in case it helps.

> summary(Student1995)
                  alcohol          drugs           smoke            sport   
 Not                  : 5   Not       :36   Not       :38   Not regular:13  
 Once or Twice a week :16   Tried once: 6   Occasional: 5   Regular    :37  
 Once a month         :12   Occasional: 7   Regular   : 7                   
 Once a week          :14   Regular   : 1                                   
 More than once a week: 3 


  • Maybe this is what you wanted. Values in each category sum up to 100%.

    lis <- sapply( Student1995, function(x) t( sapply( x, table ) ) )
    sapply( lis, function(x) colSums(prop.table(x)) )
                      Not  Once.or.Twice.a.week          Once.a.month
                      0.0                   0.6                   0.4
              Once.a.week More.than.once.a.week
                      0.0                   0.0
           Not Tried.once Occasional    Regular
           0.8        0.2        0.0        0.0
           Not Occasional    Regular
           0.6        0.2        0.2
    Not.regular     Regular
            0.4         0.6
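
    To get closer to the summary method the question asks for, the per-column proportions could be wrapped in a small S3 class. This is only a sketch under assumptions: the class name survey_props and its constructor are hypothetical (nothing in base R provides them), and the data frame is rebuilt from the five rows in the question's dput() output.

    ```r
    # Hypothetical constructor: one prop.table per column, tagged with an
    # illustrative class name ("survey_props" is not part of base R).
    survey_props <- function(df) {
      stopifnot(is.data.frame(df))
      # table() on a factor keeps unused levels, so they show up as 0
      props <- lapply(df, function(x) prop.table(table(x)))
      structure(props, class = "survey_props")
    }

    # summary() method: stack every variable's proportions into one data frame
    summary.survey_props <- function(object, ...) {
      rows <- lapply(names(object), function(v) {
        data.frame(variable   = v,
                   level      = names(object[[v]]),
                   proportion = as.vector(object[[v]]),
                   stringsAsFactors = FALSE)
      })
      do.call(rbind, rows)
    }

    # The five rows from the question's dput() output
    Student1995 <- data.frame(
      alcohol = factor(c(3, 2, 2, 2, 3), levels = 1:5,
                       labels = c("Not", "Once or Twice a week", "Once a month",
                                  "Once a week", "More than once a week")),
      drugs   = factor(c(1, 2, 1, 1, 1), levels = 1:4,
                       labels = c("Not", "Tried once", "Occasional", "Regular")),
      smoke   = factor(c(2, 3, 1, 1, 1), levels = 1:3,
                       labels = c("Not", "Occasional", "Regular")),
      sport   = factor(c(2, 1, 1, 2, 2), levels = 1:2,
                       labels = c("Not regular", "Regular"))
    )

    out <- summary(survey_props(Student1995))
    print(out)
    ```

    Because the constructor only assumes a data frame of categorical columns, a future survey with the same layout could be passed to survey_props() unchanged. The proportions within each variable sum to 1.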

    and the whole summary...

    prop.table( table(as.vector( sapply( Student1995, unlist ))) )
                     Not          Not regular           Occasional
                    0.35                 0.10                 0.05
            Once a month Once or Twice a week              Regular
                    0.10                 0.15                 0.20
              Tried once
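
    One caveat with the pooled table: answers that share a label in different columns are counted together, so "Not" at 0.35 mixes drugs and smoke, and "Regular" at 0.20 mixes smoke and sport. If that is not wanted, one possible workaround (a sketch, again using the five rows from the question's dput() output) is to prefix each answer with its column name before tabulating. The "column: answer" label format is just an illustrative choice.

    ```r
    # The five rows from the question's dput() output
    Student1995 <- data.frame(
      alcohol = factor(c(3, 2, 2, 2, 3), levels = 1:5,
                       labels = c("Not", "Once or Twice a week", "Once a month",
                                  "Once a week", "More than once a week")),
      drugs   = factor(c(1, 2, 1, 1, 1), levels = 1:4,
                       labels = c("Not", "Tried once", "Occasional", "Regular")),
      smoke   = factor(c(2, 3, 1, 1, 1), levels = 1:3,
                       labels = c("Not", "Occasional", "Regular")),
      sport   = factor(c(2, 1, 1, 2, 2), levels = 1:2,
                       labels = c("Not regular", "Regular"))
    )

    # Prefix every answer with its column name so identical labels stay distinct
    labelled <- unlist(lapply(names(Student1995), function(v) {
      paste(v, as.character(Student1995[[v]]), sep = ": ")
    }))

    # Proportion of each column/answer pair across all 20 responses
    pooled <- prop.table(table(labelled))
    print(round(pooled, 2))
    ```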