Tags: r, conditional-statements, dummy-variable

How to create a dummy variable that takes the value 1 if all values of a variable within a group exceed a certain value


I have a data set like the one below:

dat <- data.frame(id    = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2),
                  year  = c(2015, 2016, 2017, 2018, 2019, 2015, 2016, 2017, 2018, 2019),
                  ratio = c(0.6, 0.6, 0.65, 0.7, 0.8, 0.4, 1, 0.5, 0.3, 0.7))

dat
   id year ratio
1   1 2015  0.60
2   1 2016  0.60
3   1 2017  0.65
4   1 2018  0.70
5   1 2019  0.80
6   2 2015  0.40
7   2 2016  1.00
8   2 2017  0.50
9   2 2018  0.30
10  2 2019  0.70

I'd like to create a dummy variable that takes the value 1 if all values of the ratio variable within an id exceed 0.5, and 0 otherwise. The resulting data set should look like this:

dat
   id year ratio dummy
1   1 2015  0.60     1
2   1 2016  0.60     1
3   1 2017  0.65     1
4   1 2018  0.70     1
5   1 2019  0.80     1
6   2 2015  0.40     0
7   2 2016  1.00     0
8   2 2017  0.50     0
9   2 2018  0.30     0
10  2 2019  0.70     0

Any help is much appreciated. Thank you!


Solution

  • You already have a base R answer in the comments. Here are dplyr and data.table versions:

    library(dplyr)
    
    dat %>%
      group_by(id) %>%
      mutate(dummy = as.integer(all(ratio > 0.5))) %>%
      ungroup()
    
    #      id  year ratio dummy
    #   <dbl> <dbl> <dbl> <int>
    # 1     1  2015  0.6      1
    # 2     1  2016  0.6      1
    # 3     1  2017  0.65     1
    # 4     1  2018  0.7      1
    # 5     1  2019  0.8      1
    # 6     2  2015  0.4      0
    # 7     2  2016  1        0
    # 8     2  2017  0.5      0
    # 9     2  2018  0.3      0
    #10     2  2019  0.7      0
    

    With data.table:

    library(data.table)
    
    # setDT() converts dat to a data.table by reference; := adds dummy in place
    setDT(dat)[, dummy := as.integer(all(ratio > 0.5)), by = id]
    dat
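
The base R answer from the comments is not reproduced in this thread. As a rough sketch of what a base R version could look like (this uses ave() and is not necessarily what the commenter posted), the same per-group check can be written without extra packages:

```r
# Recreate the example data from the question
dat <- data.frame(id    = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2),
                  year  = c(2015, 2016, 2017, 2018, 2019, 2015, 2016, 2017, 2018, 2019),
                  ratio = c(0.6, 0.6, 0.65, 0.7, 0.8, 0.4, 1, 0.5, 0.3, 0.7))

# ave() applies all() to the logical vector within each id and
# recycles the single TRUE/FALSE result over that group's rows
dat$dummy <- as.integer(ave(dat$ratio > 0.5, dat$id, FUN = all))

dat$dummy
# 1 1 1 1 1 0 0 0 0 0
```

Note that the comparison is strict, so a ratio of exactly 0.5 does not count as exceeding the threshold. If ratio can contain NA, all() may return NA for that group; whether to pass na.rm = TRUE depends on how missing values should be treated.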