Search code examples
rdataframesplitmutate

How to split values separated by commas in the same row to different rows in R


I have some data from a a Google Forms and I'd like to slipt the common-separated answers and duplicate the participant's ID

  • The data looks like this:
> head(data)
   names         Q2               Q3                          Q4
1 PART_1     fruits    bananas, apples        brocolli, lettuce, potatoes
2 PART_2 vegetables   bananas, oranges                    brocolli
3 PART_3     fruits                        carrots, brocolli, lettuce
  • Desired output #1 (filling in with Nas):
 names         Q2               Q3                          Q4
1 PART_1     fruits        bananas                brocolli  
  PART_1       NA          apples                 lettuce,
  PART_1       NA           NA                    potatoes
so on...
  • Desired output #2 (repeat the non-multiple choice answers (as Q1):
 names         Q2               Q3                          Q4
1 PART_1     fruits        bananas                   brocolli  
  PART_1     fruits           apples                 lettuce,
  PART_1     fruits           NA                    potatoes
so on...
  • If it's possible, a tidyverse solution would be much appreciated!

Obs: The ideia is pretty much like this SQL question. I've seen this R question, but I'd like to repeat the participant's name, not rename them

  • data:
structure(list(names = c("PART_1", "PART_2", "PART_3"), Q2 = c("fruits", 
"vegetables", "fruits"), Q3 = c("bananas, apples", "bananas, oranges", 
""), Q4 = c("brocolli, lettuce, potatoes", "brocolli", "carrots, brocolli, lettuce"
)), class = "data.frame", row.names = c(NA, -3L))

Solution

  • You can do:

    library(tidyr)
    library(dplyr)
    
    dat %>% 
      pivot_longer(-c(Q2, names)) %>%
      separate_rows(value) %>%
      group_by(names, name) %>%
      mutate(row = row_number()) %>%
      pivot_wider() %>%
      select(-row)
    
    # A tibble: 8 × 4
    # Groups:   names [3]
      names  Q2         Q3        Q4      
      <chr>  <chr>      <chr>     <chr>   
    1 PART_1 fruits     "bananas" brocolli
    2 PART_1 fruits     "apples"  lettuce 
    3 PART_1 fruits      NA       potatoes
    4 PART_2 vegetables "bananas" brocolli
    5 PART_2 vegetables "oranges" NA      
    6 PART_3 fruits     ""        carrots 
    7 PART_3 fruits      NA       brocolli
    8 PART_3 fruits      NA       lettuce