I am trying to split sentences based on different criteria. I am looking to split some sentences after "traction" and some after "ramasse". I looked up the grammar rules for grepl but didn't really understand.
A data frame called export
has a column ref
, which has str values ending either with "traction" or "ramasse".
>export$ref
ref
[1] "62133130_074_traction"
[2] "62156438_074_ramasse"
[3] "62153874_070_ramasse"
[4] "62138861_074_traction"
And I want to split str values in ref column into two.
ref R&T
[1] "62133130_074_" "traction"
[2] "62156438_074_" "ramasse"
[3] "62153874_070_" "ramasse"
[4] "62138861_074_" "traction"
What I tried(none of them was good)
strsplit(export$ref, c("traction", "ramasse"))
strsplit(export$ref, "\\_(?<=\\btraction)|\\_(?<=\\bramasse)", perl = TRUE)
strsplit(export$ref, "(?=['traction''ramasse'])", perl = TRUE)
Any help would be appreciated!
Here is another option using stringr::str_split
:
library(stringr);
str_split(ref, pattern = "_(?=[A-Za-z]+)", simplify = T)
# [,1] [,2]
#[1,] "62133130_074" "traction"
#[2,] "62156438_074" "ramasse"
#[3,] "62153874_070" "ramasse"
#[4,] "62138861_074" "traction"
ref <- c(
"62133130_074_traction",
"62156438_074_ramasse",
"62153874_070_ramasse",
"62138861_074_traction")