I have a dataframe like below:
Col1 Col2
A 5!5!!6!!3!!m
B 7_8!!6!!7!!t
structure(list(Col1 = c("A", "B"), Col2 = c("5!5!!6!!3!!m", "7_8!!6!!7!!t" )), class = "data.frame", row.names = c(NA, -2L))
How do I create a new column that extracts the 3rd parse of the strings found in Col2?
In SQL I am using SPLIT_PART function:
SPLIT_PART(Col2, '!!', 3)
I am looking to find an equivalent function in R.
Expected output:
Col1 Col2 Col3
A 5!5!!6!!3!!m 3
B 7_8!!6!!7!!t 7
Here's a tidyverse
option, though the core is funcitonally identical to Rushabh's data.table based answer.
When given the simplify=T
argument, stringr::str_split
will output a matrix, with each match in a column. You can subset the desired column from it to extract the desired position:
library(tidyverse)
df1 %>%
mutate(Col3 = str_split(Col2, pattern = '!!', simplify=T)[,3])
Col1 Col2 Col3
1 A 5!5!!6!!3!!m 5!5
2 B 7_8!!6!!7!!t 7_8
df1 %>%
mutate(Col3 = str_split(Col2, pattern = '!!', simplify=T)[,2])
Col1 Col2 Col3
1 A 5!5!!6!!3!!m 6
2 B 7_8!!6!!7!!t 6
df1 %>%
mutate(Col3 = str_split(Col2, pattern = '!!', simplify=T)[,1])
Col1 Col2 Col3
1 A 5!5!!6!!3!!m 5!5
2 B 7_8!!6!!7!!t 7_8