
Converting multiple dummy variables that are not mutually exclusive into a single categorical variable, adding new rows


I have the following data frame:

data <- data.frame(id=c(1, 1, 2, 2), 
task=c(1, 2, 1, 2),
strategy1=c("1", "1", "0", "1"),
strategy2=c("0", "0", "1", "1"),
strategy3=c("0", "1", "0", "1"))

My aim is to combine the dummy variables for the different strategies into a single categorical variable, 'strategy'. If a participant used multiple strategies during a task, that task needs one row per strategy used, each with the same 'id' and 'task' values, because 'strategy' can hold only one value per row.

For the given example, the data frame should finally look like this:

data_single <- data.frame(id=c(1, 1, 1, 2, 2, 2, 2),
task=c(1, 2, 2, 1, 2, 2, 2),
strategy=c("1", "1", "3", "2", "1", "2", "3"))

Can anyone show me how I can achieve this?


Solution

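    library(tidyr)
    library(dplyr)

    # Reshape to long format: one row per id/task/strategy combination.
    # names_prefix strips "strategy" from the column names, leaving "1", "2", "3".
    pivot_longer(
      data,
      cols = starts_with("strategy"),
      names_prefix = "strategy",
      names_to = "strategy"
    ) %>%
      # Keep only the strategies that were used (the dummies are stored as "0"/"1" strings)
      filter(value == "1") %>%
      # Drop the indicator column, which is now always "1"
      select(-value)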
    # # A tibble: 7 x 3
    #      id  task strategy
    #   <dbl> <dbl> <chr>   
    # 1     1     1 1       
    # 2     1     2 1       
    # 3     1     2 3       
    # 4     2     1 2       
    # 5     2     2 1       
    # 6     2     2 2       
    # 7     2     2 3
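
If adding tidyr and dplyr is not an option, the same reshaping can probably be done in base R. The following is only a sketch of one possible approach, not part of the original answer: it assumes the dummy columns all start with "strategy" and hold the strings "0" and "1", as in the question. The names `strategy_cols`, `used`, `idx` and `result` are made up for illustration.

```r
data <- data.frame(id = c(1, 1, 2, 2),
                   task = c(1, 2, 1, 2),
                   strategy1 = c("1", "1", "0", "1"),
                   strategy2 = c("0", "0", "1", "1"),
                   strategy3 = c("0", "1", "0", "1"))

# Names of the dummy columns (assumption: they share the "strategy" prefix)
strategy_cols <- grep("^strategy", names(data), value = TRUE)

# Logical matrix that is TRUE wherever a strategy was used
used <- data[strategy_cols] == "1"

# Row/column index pairs of the TRUE cells
idx <- which(used, arr.ind = TRUE)

# Sort by original row, then by column, so rows stay grouped by id and task
idx <- idx[order(idx[, "row"], idx[, "col"]), , drop = FALSE]

# Build one output row per used strategy
result <- data.frame(
  id = data$id[idx[, "row"]],
  task = data$task[idx[, "row"]],
  strategy = sub("^strategy", "", strategy_cols[idx[, "col"]])
)

print(result)
```

Unlike the pivot_longer() version, this returns a plain data.frame rather than a tibble. The values should match the `data_single` frame from the question.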