r dataframe dplyr frequency

Frequency of values per column in table


What is a good way to count how often each value occurs in each of several columns, independently per column, using dplyr? I want to go from a table of values:

# A tibble: 7 x 4
      a     b     c     d
  <int> <int> <int> <int>
1     1     2     1     3
2     1     2     1     3
3     2     2     5     3
4     3     2     4     3
5     3     3     2     3
6     5     3     4     3
7     5     4     2     1

to a frequency table like so:

# A tibble: 5 x 5
      x   a_n   b_n   c_n   d_n
  <int> <int> <int> <int> <int>
1     1     2     0     2     1
2     2     1     4     2     0
3     3     2     2     0     6
4     4     0     1     2     0
5     5     2     0     1     0

I'm still trying to get my head around dplyr, but it seems like this is something it could do. If it is easier to do with an add-on library, that is fine too.


Solution

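    library(dplyr)     # loaded here only for the %>% pipe
    library(reshape2)  # provides melt() and dcast()

    df %>%
      # with no id variables, melt() stacks every column into (variable, value) pairs
      melt() %>%
      # one row per value, one column per variable; each cell counts the occurrences
      dcast(value ~ variable, fun.aggregate = length)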
    
    #   value a b c d
    # 1     1 2 0 2 1
    # 2     2 1 4 2 0
    # 3     3 2 2 0 6
    # 4     4 0 1 2 0
    # 5     5 2 0 1 0
    

    Data

    df <- structure(list(a = c(1L, 1L, 2L, 3L, 3L, 5L, 5L), b = c(2L, 2L, 
    2L, 2L, 3L, 3L, 4L), c = c(1L, 1L, 5L, 4L, 2L, 4L, 2L), d = c(3L, 
    3L, 3L, 3L, 3L, 3L, 1L)), .Names = c("a", "b", "c", "d"), class = "data.frame", row.names = c("1", 
    "2", "3", "4", "5", "6", "7"))