r, duplicates, rows

Change column value from character to number or duplicate row in R


I'm having an issue with a variable in some data that I have in R. I have a table that looks something like this:

Variable X   Variable Y
11           [1]
15           [400]
17           [1,2]
21           [13,14]

For every entry in 'Variable Y', if there is only one number in the square brackets, I want to remove the brackets. If there is more than one number, I want to remove the brackets and then duplicate the entire row once per number, changing only the value in 'Variable Y' in each copy. I basically want the table to look like this:

Variable X   Variable Y
11           1
15           400
17           1
17           2
21           13
21           14

I've been able to convert the single-number entries to numbers using the parse_number() function from the readr package, but it's very slow, so I'd like to improve on it. I also have no idea how to create the duplicate rows when there is more than one entry in the square brackets. Any help would be greatly appreciated, thank you.


Solution

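  • You can first remove the square brackets with gsub() and then split the comma-separated values into one row each with tidyr's separate_rows() (the columns are assumed to be named X and Y):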

    library(dplyr)
    library(tidyr)
    df %>%
      mutate(Y = gsub("\\[|\\]", "", Y)) %>%
      separate_rows(Y, sep = ",")
    # A tibble: 6 × 2
          X Y    
      <dbl> <chr>
    1    11 1    
    2    15 400  
    3    17 1    
    4    17 2    
    5    21 13   
    6    21 14
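
In the output above, Y is still a character column, while the question asks for numbers. The following is a minimal, self-contained sketch of the same approach that also converts the type. The data frame is a hypothetical reconstruction of the question's table, and the column names X and Y are assumptions carried over from the answer. It relies on the `convert` argument of `separate_rows()`, which runs type conversion on the split column.

```r
library(dplyr)
library(tidyr)

# Hypothetical reconstruction of the table from the question;
# the column names X and Y are assumptions, not the asker's real names.
df <- tibble(
  X = c(11, 15, 17, 21),
  Y = c("[1]", "[400]", "[1,2]", "[13,14]")
)

result <- df %>%
  # Strip the opening and closing square brackets.
  mutate(Y = gsub("\\[|\\]", "", Y)) %>%
  # Split on commas into one row per value; convert = TRUE turns the
  # resulting strings into a numeric type instead of leaving them as character.
  separate_rows(Y, sep = ",", convert = TRUE)

result
```

If the real data has spaces after the commas, a separator pattern such as `sep = ",\\s*"` may be a safer choice. In tidyr 1.3.0 and later, `separate_longer_delim()` covers the same use case, with the type conversion done in a separate step.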