I'm trying to add a new column to a data frame that contains whichever pattern matched the corresponding cell in a different column.
For example, I have a column whose values are strings mixing useful and non-useful information, like this:
data.frame(A = c("148apple32394", "386pear3", "23banana3808"))
              A
1 148apple32394
2      386pear3
3  23banana3808
I would like to compare this column to a vector of possible patterns, e.g.:
patterns <- c("apple", "banana", "pear")
and return a new column containing whatever pattern matched, the end result being:
              A      B
1 148apple32394  apple
2      386pear3   pear
3  23banana3808 banana
I know grep doesn't work well with vectors of patterns, so is there another function that would work? Ideally I would like to implement the solution using mutate().
Thanks!
You could use str_extract with the patterns collapsed by | to detect and extract the matching pattern, like this:
df <- data.frame(A = c("148apple32394", "386pear3", "23banana3808"))
patterns <- c("apple", "banana", "pear")
library(dplyr)
library(stringr)
df %>%
  mutate(B = str_extract(A, paste(patterns, collapse = "|")))
#>               A      B
#> 1 148apple32394  apple
#> 2      386pear3   pear
#> 3  23banana3808 banana
Created on 2023-03-10 with reprex v2.0.2
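If you would rather not depend on stringr, the same idea can probably be sketched in base R with regexpr() and regmatches(). This is only a rough equivalent, not a drop-in replacement for the code above. The fourth row ("42kiwi7") is a hypothetical value added here to show what happens when nothing matches, and the names regex and m are just illustrative.

```r
# Same data as above, plus one hypothetical row ("42kiwi7") that matches no pattern.
df <- data.frame(A = c("148apple32394", "386pear3", "23banana3808", "42kiwi7"),
                 stringsAsFactors = FALSE)
patterns <- c("apple", "banana", "pear")

# Build a single alternation regex: "apple|banana|pear".
regex <- paste(patterns, collapse = "|")

# regexpr() gives the position of the first match in each string (-1 if none).
m <- regexpr(regex, df$A)

# regmatches() returns only the substrings that matched, so fill the matched
# rows and leave the others as NA.
df$B <- NA_character_
df$B[m != -1] <- regmatches(df$A, m)

df
```

As with str_extract, only the first match in each string is kept, and rows with no match end up as NA. Both versions treat the patterns as regular expressions, so a pattern containing characters such as . or ( would need escaping first.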