I have a colour column that sometimes contains a single colour reference, and sometimes contains multiple colours, which are separated by "||"
library(tidyverse)
id <- c(1:10)
colour <- c("sky-blue","blood-red","lavender-purple",
"sky-blue||blood-red", "midnight-blue", "blood-red||lavender-purple||sky-blue",
"grass-green","sky-blue||blood-red||lavender-purple||midnight-blue",
"grass-green","grass-green||midnight-blue")
df <- tibble("id" = id,
"colour" = colour)
# A tibble: 10 × 2
id colour
<int> <chr>
1 1 sky-blue
2 2 blood-red
3 3 lavender-purple
4 4 sky-blue||blood-red
5 5 midnight-blue
6 6 blood-red||lavender-purple||sky-blue
7 7 grass-green
8 8 sky-blue||blood-red||lavender-purple||midnight-blue
9 9 grass-green
10 10 grass-green||midnight-blue
I would like to separate those colours into individual columns, such that each column only contains one colour, then I would like to stack the colours with duplicate ids. (using gather()
) The name of the new colour columns is rather irrelevant, so I went for "col_1", "col_2", etc. , since I will then stack them again. However, if I run separate()
, it does the following:
df %>%
separate(colour, into = c("col_1","col_2","col_3","col_4"), sep = "||")
# A tibble: 10 × 5
id col_1 col_2 col_3 col_4
<int> <chr> <chr> <chr> <chr>
1 1 "" s k y
2 2 "" b l o
3 3 "" l a v
4 4 "" s k y
5 5 "" m i d
6 6 "" b l o
7 7 "" g r a
8 8 "" s k y
9 9 "" g r a
10 10 "" g r a
This also happens if I run it on a single row with the exact right number of columns in the into=
I have looked at some solutions, but haven't found something that covers irregular separator occurrence, and irregular expression length. Any solution would be most welcome.
You need to escape special symbols with \\
.
So try:
df %>%
separate_rows(colour, sep = "\\|\\|") %>%
mutate(ind = 1) %>%
pivot_wider(
names_from = colour,
values_from = ind,
values_fill = 0
)
Output is:
# A tibble: 10 × 6
id `sky-blue` `blood-red` `lavender-purple` `midnight-blue` `grass-green`
<int> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 1 0 0 0 0
2 2 0 1 0 0 0
3 3 0 0 1 0 0
4 4 1 1 0 0 0
5 5 0 0 0 1 0
6 6 1 1 1 0 0
7 7 0 0 0 0 1
8 8 1 1 1 1 0
9 9 0 0 0 0 1
10 10 0 0 0 1 1