
Convert a data frame column into a frequency distribution in R


I have recently started working on some statistical problems in R and I have a question. I normally code in Python and find the "collections.Counter" class quite useful. However, I could not find an equivalent in R, which surprised me, since frequencies are used so often in statistics.

For example, I have this data frame, df:

c1          c2
reading1    2
reading2    3
reading3    1
reading4    3
reading5    2
reading6    4
reading7    1
reading8    2
reading9    4
reading10   5 

and I want to get this in R:

value    frequency
    1    2
    2    3
    3    2
    4    2
    5    1

I hope this illustrates what I would like to do. Any help is appreciated.

For illustration, in Python I could do this:

import collections

df_c2 = [2, 3, 1, 3, 2, 4, 1, 2, 4, 5]
counter = collections.Counter(df_c2)  # count occurrences of each value
print(counter)

and get this: Counter({2: 3, 1: 2, 3: 2, 4: 2, 5: 1}),
which I can then manipulate using loops.

Solution

  • The simplest way is to use table(), which returns an object of class "table" that behaves much like a named vector of counts:

    > table(df$c2)
    
    1 2 3 4 5 
    2 3 2 2 1 
    

    You can convert the result to a data.frame, with the values in Var1 (as a factor) and the counts in Freq:

    > data.frame(table(df$c2))
      Var1 Freq
    1    1    2
    2    2    3
    3    3    2
    4    4    2
    5    5    1
    

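    You can, of course, also use packages from the "tidyverse" (dplyr in this case).

    library(tidyverse)
    # dplyr's count(df, c2) is a shorthand; it names the count column n
    df %>%
      group_by(c2) %>%       # one group per distinct value
      summarise(freq = n())  # count the rows in each group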
    # # A tibble: 5 x 2
    #      c2  freq
    #   <int> <int>
    # 1     1     2
    # 2     2     3
    # 3     3     2
    # 4     4     2
    # 5     5     1
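
Since the question mentions manipulating the Counter with loops, here is a minimal sketch of doing the same with the result of table(), using base R only. The names counts and freq are just illustrative, and the data frame is rebuilt from the values shown in the question:

```r
# Sample data, rebuilt from the table in the question
df <- data.frame(
  c1 = paste0("reading", 1:10),
  c2 = c(2, 3, 1, 3, 2, 4, 1, 2, 4, 5)
)

counts <- table(df$c2)

# Look up the count for one value; the names are character strings
counts[["2"]]

# Sort by descending frequency, roughly like Counter.most_common()
sort(counts, decreasing = TRUE)

# Loop over value/count pairs
for (value in names(counts)) {
  cat(value, "occurs", counts[[value]], "times\n")
}

# Data frame with the column names used in the question;
# stringsAsFactors = FALSE keeps the values as character, so they
# can be converted back to numbers directly
freq <- as.data.frame(counts, stringsAsFactors = FALSE)
names(freq) <- c("value", "frequency")
freq$value <- as.numeric(freq$value)
freq
```

Note that table() stores the counted values as names, so they come back as character strings and need converting if you want to do arithmetic on them.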