Simple way to make a percentage table for discrete data with more than two groups?


Is there an easy way to make such a table for discrete data in R? The table should (1) show row percentages, so each cell is a proportion of its row total, and (2) be split by more than two grouping variables. For example, given the data:

Success Gender Level
1       M    High
1       M    Low
1       F    Med
0       M    Low
0       M    Med
0       F    High

The desired table looks like this:

                 Success=1                              Success=0
        Level=High  Level=Med  Level=Low       Level=High  Level=Med  Level=Low
Gender=F   0           0.5         0               0.5           0          0
Gender=M   0.25        0           0.25            0             0.25       0.25

Solution

  • You can use ftable() with prop.table(). Setting row.vars = 2 puts the second column (Gender) in the rows, and margin = 1 divides each cell by its row total. The result looks like your desired table, with the levels in a slightly different order.

    prop.table(ftable(df, row.vars = 2), margin = 1)
    #        Success    0              1          
    #        Level   High  Low  Med High  Low  Med
    # Gender                                      
    # F              0.50 0.00 0.00 0.00 0.00 0.50
    # M              0.00 0.25 0.25 0.25 0.25 0.00
    

    For the exact desired table, re-create the factor columns with their levels in the order you want them displayed.

    df2 <- transform(
        df, 
        Level = factor(Level, levels = c("High", "Med", "Low")),
        Success = factor(Success, levels = 1:0)
    )
    
    prop.table(ftable(df2, row.vars = 2), margin = 1)
    #        Success    1              0          
    #        Level   High  Med  Low High  Med  Low
    # Gender                                      
    # F              0.00 0.50 0.00 0.50 0.00 0.00
    # M              0.25 0.00 0.25 0.00 0.25 0.25
    

    Data:

    df <- structure(list(Success = c(1L, 1L, 1L, 0L, 0L, 0L), Gender = structure(c(2L, 
    2L, 1L, 2L, 2L, 1L), .Label = c("F", "M"), class = "factor"), 
        Level = structure(c(1L, 2L, 3L, 2L, 3L, 1L), .Label = c("High", 
        "Low", "Med"), class = "factor")), .Names = c("Success", 
    "Gender", "Level"), class = "data.frame", row.names = c(NA, -6L
    ))
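
As a quick sanity check of the approach above, here is a minimal self-contained sketch. It rebuilds the same six rows with data.frame() instead of the dput() output (the names counts and row_props are only illustrative), and it refers to the row variable by name rather than by position, which ftable() also accepts.

```r
# Rebuild the example data from the question (same six rows as the dput() output).
# factor() is called explicitly so the columns are factors in any R version.
df <- data.frame(
  Success = c(1L, 1L, 1L, 0L, 0L, 0L),
  Gender  = factor(c("M", "M", "F", "M", "M", "F")),
  Level   = factor(c("High", "Low", "Med", "Low", "Med", "High"))
)

# Flat contingency table: Gender in the rows, Success and Level nested in the columns.
counts <- ftable(df, row.vars = "Gender")

# margin = 1 divides every cell by its row total, giving row proportions.
row_props <- prop.table(counts, margin = 1)

# Each row of proportions should add up to 1.
row_totals <- rowSums(row_props)

print(row_props)
print(row_totals)
```

If you want percentages rather than proportions, multiplying the result by 100 before printing should be enough.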