r dataframe count frequency

Frequency count based on two columns in R


I have a single data frame, shown below.

df <- data.frame(o = c(rep("a", 12), rep("b", 3)), d = c(0, 0, 1, 0, 0.3, 0.6, 0, 1, 2, 3, 4, 0, 0, 1, 0))

> df
   o   d
1  a 0.0
2  a 0.0
3  a 1.0
4  a 0.0
5  a 0.3
6  a 0.6
7  a 0.0
8  a 1.0
9  a 2.0
10 a 3.0
11 a 4.0
12 a 0.0
13 b 0.0
14 b 1.0
15 b 0.0

I want to add a new column that counts frequency based on both columns 'o' and 'd'. The count should restart whenever the value of column 'd' is zero, so that each row shows the number of rows in its run, as in the hand-made result below.

> df_result
   o   d freq
1  a 0.0    1
2  a 0.0    2
3  a 1.0    2
4  a 0.0    3
5  a 0.3    3
6  a 0.6    3
7  a 0.0    5
8  a 1.0    5
9  a 2.0    5
10 a 3.0    5
11 a 4.0    5
12 a 0.0    1
13 b 0.0    2
14 b 1.0    2
15 b 0.0    1



Solution

  • In base R, use ave, grouping on cumsum(d == 0) so that a new group starts at every zero:

    df$freq <- with(df, ave(d, cumsum(d == 0), FUN = length))
    df
    
    #   o   d freq
    #1  a 0.0    1
    #2  a 0.0    2
    #3  a 1.0    2
    #4  a 0.0    3
    #5  a 0.3    3
    #6  a 0.6    3
    #7  a 0.0    5
    #8  a 1.0    5
    #9  a 2.0    5
    #10 a 3.0    5
    #11 a 4.0    5
    #12 a 0.0    1
    #13 b 0.0    2
    #14 b 1.0    2
    #15 b 0.0    1
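
To see why the base R line above works, it can help to look at the grouping key on its own. The following is a minimal sketch that rebuilds the question's data and stores the key in an illustrative helper variable `grp` (a name that is not part of the original answer). `d == 0` marks the rows where a run starts, and `cumsum()` turns those marks into a run id that increases by one at every zero. The names `run_sizes`, `freq_lookup` and `freq_ave` are likewise only for illustration.

```r
# Data from the question
df <- data.frame(o = c(rep("a", 12), rep("b", 3)),
                 d = c(0, 0, 1, 0, 0.3, 0.6, 0, 1, 2, 3, 4, 0, 0, 1, 0))

# Illustrative helper: the run id goes up by one at every row where d == 0
grp <- cumsum(df$d == 0)

# Number of rows in each run, indexed by run id
run_sizes <- tabulate(grp)

# Looking up each row's run size; this lookup assumes the first row has
# d == 0, so that run ids start at 1
freq_lookup <- run_sizes[grp]

# The same column as computed by ave() with FUN = length
freq_ave <- ave(df$d, grp, FUN = length)
```

For this data, `freq_lookup` and `freq_ave` hold the same values as the `freq` column shown above.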
    

    With dplyr:

    library(dplyr)
    # name = "freq" names the count column; the helper column grp is kept as well
    df %>% add_count(grp = cumsum(d == 0), name = "freq")
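
Both approaches above group only on the running count of zeros and never look at column 'o'. For the data in the question that is enough, because 'o' changes exactly at a row where 'd' is zero. If that is not guaranteed, one possible variant is to pass 'o' to ave as an additional grouping variable. The following is only a sketch under that assumption; `df2` and its column names `freq_d_only` and `freq_o_and_d` are made up for illustration and do not come from the question.

```r
# Hypothetical data: 'o' switches from "a" to "b" on a row where d is not zero
df2 <- data.frame(o = c("a", "a", "b", "b"),
                  d = c(0, 1, 2, 0))

# Grouping on the zero counter alone puts rows 1 to 3 into one run
df2$freq_d_only <- with(df2, ave(d, cumsum(d == 0), FUN = length))

# Adding 'o' as a second grouping variable also starts a new run when 'o' changes
df2$freq_o_and_d <- with(df2, ave(d, o, cumsum(d == 0), FUN = length))
```

Here `freq_d_only` comes out as 3, 3, 3, 1, while `freq_o_and_d` comes out as 2, 2, 1, 1, because the third row is split off into its own run once 'o' is part of the grouping.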