Search code examples
rfrequency

How do i combine 3 columns and get the frequency of them?


I currently have this table given to me:

   City_Category               Cat1               Cat2               Cat3
1               A                  3                  0                  0
2               A                  1                  6                 14
3               C                  4                  5                  9
4               A                 12                 14                  0
5               C                  8                  0                  0
6               A                  1                  2                  0
7               A                 12                 14                 17

The numbers under the category 1,2 and 3 are sort of the categories where the products fall under. How do I combine them together to make a table like the following?

    City_Category               Category               Freq
1               A                      0                  4
2               B                      0                  0
3               C                      0                  2
...
17              A                      14                 3

Solution

  • Try:

    library(dplyr)
    library(tidyr)
    
    gather(df, name, Category, -City_Category) %>%
      count(City_Category, Category, name = 'Freq')
    

    Output:

    # A tibble: 13 x 3
       City_Category Category  Freq
       <fct>            <int> <int>
     1 A                    0     4
     2 A                    1     2
     3 A                    2     1
     4 A                    3     1
     5 A                    6     1
     6 A                   12     2
     7 A                   14     3
     8 A                   17     1
     9 C                    0     2
    10 C                    4     1
    11 C                    5     1
    12 C                    8     1
    13 C                    9     1
    

    Data used:

    df <- structure(list(City_Category = structure(c(1L, 1L, 2L, 1L, 2L, 
    1L, 1L), .Label = c("A", "C"), class = "factor"), Cat1 = c(3L, 
    1L, 4L, 12L, 8L, 1L, 12L), Cat2 = c(0L, 6L, 5L, 14L, 0L, 2L, 
    14L), Cat3 = c(0L, 14L, 9L, 0L, 0L, 0L, 17L)), class = "data.frame", row.names = c("1", 
    "2", "3", "4", "5", "6", "7"))
    

    If for some reason you cannot work with the latest dplyr version, you can do:

    gather(df, name, Category, -City_Category) %>%
      count(City_Category, Category) %>%
      rename(Freq = n)