I have a dataset with a column containing the symbol '|' (it comes from the interaction of 2 variables in a model), and I want to split the column on this character.
The function separate works well with standard characters. Do you know how I can specify the character '|'?
library(tidyverse)
df <- data.frame(Interaction = c('var1|var2'))
# as expected
df %>% separate(Interaction, c('var1', 'var2'), sep = '1')
# var1 var2
# 1 var |var2
# not as expected
df %>% separate(Interaction, c('var1', 'var2'), sep = '|')
# var1 var2
# 1 v
We can either escape (\\) the |, since it is a regex metacharacter meaning OR and sep is interpreted as a regular expression by default.
If we look at the ?separate documentation, the usage is
separate(data, col, into, sep = "[^[:alnum:]]+", remove = TRUE, convert = FALSE, extra = "warn", fill = "warn", ...)
and it is described as
sep - If character, is interpreted as a regular expression. The default value is a regular expression that matches any sequence of non-alphanumeric values.
df %>%
  separate(Interaction, c('var1', 'var2'), sep = '\\|')
or place it in square brackets, where it is treated as a literal character
df %>%
  separate(Interaction, c('var1', 'var2'), sep = '[|]')
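To see why the unescaped pattern misbehaves, here is a small base R sketch. It uses strsplit() purely as an illustration of the regex behaviour; this is an assumption made for the demo, not a claim about what separate() calls internally, and the exact pieces separate() returns for the unescaped case differ from what strsplit() gives.

```r
# Base R only; strsplit() stands in for "split on a regex" in general.
x <- "var1|var2"

# Unescaped: "|" is alternation between two empty patterns, so it matches
# the empty string and the input is broken up far more than intended.
unescaped <- strsplit(x, "|")[[1]]

# Escaped or bracketed: the pipe is matched as a literal character.
escaped   <- strsplit(x, "\\|")[[1]]
bracketed <- strsplit(x, "[|]")[[1]]

# fixed = TRUE turns off regex interpretation altogether.
literal   <- strsplit(x, "|", fixed = TRUE)[[1]]

print(unescaped)
print(escaped)
```

The escaped, bracketed and fixed variants all yield the two pieces "var1" and "var2", which is the same effect the sep = '\\|' and sep = '[|]' calls above rely on.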