Tags: r, loops, statistics, dummy-variable, generate

How to use a while loop to create a dummy variable based on an existing variable?


I have a categorical variable called "X1" and a dummy variable called "X2". Now I want to create a dummy variable X3 that follows this logic:

If at least one row within a category of X1 has X2 = 1, then set X3 = 1 for all rows of that category; otherwise set X3 = 0.

X1<-c(1,1,2,2,3,3)
X2<-c(0,1,0,0,1,1)

The desired output I am looking for is like this:

X1 X2  X3
 1  0   1
 1  1   1
 2  0   0
 2  0   0
 3  1   1
 3  1   1

I appreciate any help on this.


Solution

  • Here's a dplyr solution:

    df = data.frame(
      X1 = c(1,1,2,2,3,3),
      X2 = c(0,1,0,0,1,1)
    )
    
    
    library(dplyr)
    df %>%
      group_by(X1) %>%
      mutate(X3 = ifelse(1 %in% X2, 1, 0))
    # # A tibble: 6 x 3
    # # Groups:   X1 [3]
    #      X1    X2    X3
    #   <dbl> <dbl> <dbl>
    # 1     1     0     1
    # 2     1     1     1
    # 3     2     0     0
    # 4     2     0     0
    # 5     3     1     1
    # 6     3     1     1
    

    Here's the same idea in base R:

    df$X3 = with(df, ave(X2, X1, FUN = function(x) ifelse(1 %in% x, 1, 0)))
    df
    #   X1 X2 X3
    # 1  1  0  1
    # 2  1  1  1
    # 3  2  0  0
    # 4  2  0  0
    # 5  3  1  1
    # 6  3  1  1
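
Since the question title asks about a while loop, here is a minimal sketch of the same rule written as an explicit loop over the categories of X1. The names `groups`, `in_group`, `i`, and `x3_max` are illustrative and not from the original post. The sketch assumes X1 has no missing values and X2 only holds 0 or 1. The grouped versions above are usually the more idiomatic choice in R.

```r
# Same toy data as in the question.
df <- data.frame(
  X1 = c(1, 1, 2, 2, 3, 3),
  X2 = c(0, 1, 0, 0, 1, 1)
)

# Explicit while loop over the distinct categories of X1.
groups <- unique(df$X1)
df$X3 <- 0            # start with 0 everywhere
i <- 1
while (i <= length(groups)) {
  in_group <- df$X1 == groups[i]
  # Flag the whole category if any of its rows has X2 == 1.
  if (any(df$X2[in_group] == 1, na.rm = TRUE)) {
    df$X3[in_group] <- 1
  }
  i <- i + 1
}

# Vectorised shortcut, assuming X2 is strictly 0/1:
# the group maximum is 1 exactly when some row in the group is 1.
x3_max <- ave(df$X2, df$X1, FUN = max)
```

Both `df$X3` and `x3_max` should match the X3 column in the desired output. The loop is mainly useful for seeing the logic step by step.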