I'm having an issue with a variable in some data that I have in R. I have a table that looks something like this:
Variable X | Variable Y |
---|---|
11 | [1] |
15 | [400] |
17 | [1,2] |
21 | [13,14] |
For every entry in 'Variable Y', if there is only one number in the square brackets, I want to get rid of the square brackets. If there is more than one number, I want to get rid of the brackets and then duplicate the entire row once per number, changing only the value in 'Variable Y' in each duplicate. I basically want the table to look like this:
Variable X | Variable Y |
---|---|
11 | 1 |
15 | 400 |
17 | 1 |
17 | 2 |
21 | 13 |
21 | 14 |
I've been able to convert the single-number entries to numbers using the `parse_number()` function from the readr package, but it's very slow, so I'd like to improve on it. I also have no idea how to create the duplicate rows when there is more than one number in the square brackets. Any help would be greatly appreciated, thank you.
You can first remove the square brackets and then use `separate_rows`:
library(dplyr)
library(tidyr)
df %>%
  mutate(Y = gsub("\\[|\\]", "", Y)) %>%
  separate_rows(Y, sep = ",")
# A tibble: 6 × 2
      X Y
  <dbl> <chr>
1    11 1
2    15 400
3    17 1
4    17 2
5    21 13
6    21 14
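Note that `Y` is still a character column in the output above. If you need numbers, one option is the `convert = TRUE` argument of `separate_rows`, which re-types the split column. Here is a minimal, self-contained sketch of that variant. The sample `df` is rebuilt from the table in the question, and the column names `X` and `Y` are assumptions standing in for 'Variable X' and 'Variable Y'.

```r
library(dplyr)
library(tidyr)

# Sample data rebuilt from the question's table.
# The column names X and Y are assumed, not taken from the real data.
df <- tibble(
  X = c(11, 15, 17, 21),
  Y = c("[1]", "[400]", "[1,2]", "[13,14]")
)

result <- df %>%
  # Strip the opening and closing square brackets
  mutate(Y = gsub("\\[|\\]", "", Y)) %>%
  # One row per comma-separated value; convert = TRUE turns Y into a numeric type
  separate_rows(Y, sep = ",", convert = TRUE)

result
```

If the real data has spaces after the commas (for example `[1, 2]`), passing a regular expression such as `sep = ",\\s*"` should split those cleanly as well.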