
Is there an R function similar to pandas.crosstab that generates a joint frequency table with named attributes?


I'd like to create frequency tables iteratively, both for a single variable against Y ([var1, Y] or [var2, Y]) and for joint variables against Y ([var1, var2, Y]).

The R code below can only produce the single-variable frequency table and the joint frequency table separately.

c1 <- ftable(variable[[1]], data1[,3])
#     Fund
# 
# b    21
# c   206
# d  1127

c1 <- ftable(variable[[3]], data1[,3])   
#     x.2   a   b   c   d
# x.1                    
# b         0   9   4   8
# c         0 116  51  39
# d         5 542 291 289

#variable[[3]] is a joint variable of variable[[1]] and variable[[2]]

as.matrix(as.vector(t(c1))) 
#       [,1]
# [1,]    0
# [2,]    9
# [3,]    4
# [4,]    8
# [5,]    0
# [6,]  116
# [7,]   51
# [8,]   39
# [9,]    5
# [10,]  542
# [11,]  291
# [12,]  289


ftable(variable[[1]], variable[[2]], data1[,3])
#       Fund
# 
# b a     0
#   b     9
#   c     4
#   d     8
# c a     0
#   b   116
#   c    51
#   d    39
# d a     5
#   b   542
#   c   291
#   d   289

Is there a way to generate these frequency tables together while also keeping the named attributes?


Solution

  • You can use addmargins to add margins (row and column sums) to a table.

    For example:

    data(mtcars)
    
    addmargins(table(mtcars[c("cyl", "gear")]))
    #      gear
    # cyl    3  4  5 Sum
    #   4    1  8  2  11
    #   6    2  4  1   7
    #   8   12  0  2  14
    #   Sum 15 12  5  32
    
    ftable(addmargins(table(mtcars[c("cyl", "gear", "carb")])))
    #          carb  1  2  3  4  6  8 Sum
    # cyl gear                           
    # 4   3          1  0  0  0  0  0   1
    #     4          4  4  0  0  0  0   8
    #     5          0  2  0  0  0  0   2
    #     Sum        5  6  0  0  0  0  11
    # 6   3          2  0  0  0  0  0   2
    #     4          0  0  0  4  0  0   4
    #     5          0  0  0  0  1  0   1
    #     Sum        2  0  0  4  1  0   7
    # 8   3          0  4  3  5  0  0  12
    #     4          0  0  0  0  0  0   0
    #     5          0  0  0  1  0  1   2
    #     Sum        0  4  3  6  0  1  14
    # Sum 3          3  4  3  5  0  0  15
    #     4          4  4  0  4  0  0  12
    #     5          0  2  0  1  1  1   5
    #     Sum        7 10  3 10  1  1  32
    

    I build the table with table first, because addmargins expects the output of table, not of ftable. For the three-dimensional table, I then pass the result to ftable, which prints it in a flatter, more readable layout.
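
    As a minimal sketch of how this relates to the question: the object returned by addmargins keeps its dimension names, so the single-variable counts (the "Sum" row and column) and the joint counts can be looked up by name from the same table. The names tab, joint, and cyl_totals are illustrative only and do not come from the original answer.

```r
# Two-way table of cyl by gear, with row and column sums added
tab <- addmargins(table(mtcars[c("cyl", "gear")]))

# The variable names survive as names of the dimnames
names(dimnames(tab))

# Joint count: 8-cylinder cars with 3 gears
joint <- tab["8", "3"]

# Single-variable counts for cyl are stored in the "Sum" column
cyl_totals <- tab[c("4", "6", "8"), "Sum"]
```

    Indexing by level name rather than by position avoids depending on the order in which the levels happen to appear.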

    Generating all possible tables

    # Select the columns to tabulate
    dta <- mtcars[c("cyl", "vs", "am", "gear", "carb")]
    
    # Generate every non-empty combination of column indices
    combinations <- unlist(
      lapply(seq_len(ncol(dta)),
             function(x) combn(seq_len(ncol(dta)), x, simplify = FALSE)),
      recursive = FALSE)
    
    # Compute a frequency table for each combination of columns
    tables <- lapply(combinations, function(cols) ftable(dta[cols]))
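
    One possible way to keep track of which table belongs to which variables is to build the combinations from column names and label the list accordingly. The sketch below is a variation on the code above, not part of the original answer: it uses table instead of ftable so that cells can be indexed by level name, and the "+" separator and the name long are arbitrary choices. The last step uses as.data.frame to get a long layout with one row per cell, which may be the closest analogue to a labelled crosstab.

```r
dta <- mtcars[c("cyl", "vs", "am", "gear", "carb")]

# All non-empty subsets of column names (names instead of positions)
combinations <- unlist(
  lapply(seq_along(dta),
         function(k) combn(names(dta), k, simplify = FALSE)),
  recursive = FALSE
)

# One contingency table per subset of columns
tables <- lapply(combinations, function(cols) table(dta[cols]))

# Label each table with the columns it was built from, e.g. "cyl+gear"
names(tables) <- vapply(combinations, paste, character(1), collapse = "+")

# Look up a table, and a cell within it, by name
tables[["cyl+gear"]]["8", "3"]

# Long format: one row per cell, level names kept as columns
long <- as.data.frame(table(dta[c("cyl", "gear")]))
```

    With five columns this produces 31 tables (every non-empty subset), so for wider data it is probably worth restricting the subset size passed to combn.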