How can I count the frequency of a string by another column's value in an R data frame?


A simplified version of the data frame I'm working with is:

> df1
         Any              nomMun
   1     2010             CADAQUES
   2     2011             CADAQUES
   3     2012             CADAQUES
   4     2010             BEGUR
   5     2011             BEGUR
   6     2012             BEGUR

I've been reading some posts and found that count from the plyr library returns a data frame with the strings and their frequencies. But I want the frequency by year. The final result I want to obtain is a data frame like:

> df2
         nomMun       freq_2010     freq_2011     freq_2012
   1     CADAQUES         1             1             1
   2     BEGUR            1             1             1

Could anyone help me?

Sorry if my explanation is unclear. I'm not a native speaker and this is my first time asking here.


Solution

  • In data.table, simply use .N:

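    library(data.table)
    setDT(df1)                 # convert df1 to a data.table by reference
    df1[, .N, .(nomMun, Any)]  # .N is the number of rows in each (nomMun, Any) group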
    

    This will give you the data in long format. In other words, it will look like:

    nomMun      Any      N
    CADAQUES    2010     1
    CADAQUES    2011     1
    CADAQUES    2012     1
    BEGUR       2010     1
    BEGUR       2011     1
    BEGUR       2012     1
    

    You can then dcast it to wide format if you'd like:

    dcast(df1[, .N, .(nomMun, Any)], nomMun ~ Any, value.var = "N")
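
Putting the steps together, here is a minimal end-to-end sketch that rebuilds the example data from the question and produces the freq_2010 / freq_2011 / freq_2012 layout that was asked for. The intermediate names counts and year_cols, the fill = 0L argument, and the renaming step are my own additions for illustration, not part of the original answer; the sketch assumes the data.table package is installed.

```r
library(data.table)

# Rebuild the example data from the question
df1 <- data.table(
  Any    = rep(c(2010, 2011, 2012), times = 2),
  nomMun = rep(c("CADAQUES", "BEGUR"), each = 3)
)

# Long format: one row per (nomMun, Any) pair, with N = number of rows in that group
counts <- df1[, .N, by = .(nomMun, Any)]

# Wide format: one row per nomMun, one column per year.
# fill = 0L (an assumption, not in the original answer) puts 0 where a
# town has no rows for a given year instead of NA.
df2 <- dcast(counts, nomMun ~ Any, value.var = "N", fill = 0L)

# Rename the year columns so they read freq_2010, freq_2011, ...
year_cols <- setdiff(names(df2), "nomMun")
setnames(df2, year_cols, paste0("freq_", year_cols))

print(df2)
```

dcast names the new columns after the values of Any, so the renaming step is only needed if you want the freq_ prefix shown in the question.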