I have a data frame in R, as below:
Fruits
Apple:1
Apple:4
Banana
Papaya
Orange, Apple:2
I want to keep only the rows that contain the string Apple and nothing else:
Apple:1
Apple:4
I tried using the dplyr package:
df <- dplyr::filter(df, grepl('Apple', Fruits))
But this keeps every row that contains the string Apple, including the one with another fruit:
Apple:1
Apple:4
Orange, Apple:2
How can I drop the rows that contain multiple fruits and keep only the rows that contain one specific string (in this case Apple)?
EDIT:
Assuming, based on the OP's comments, that the rows to keep are those where the only fruit mentioned is Apple, and assuming further that the list of non-Apple fruits is manageable, you could do this:
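library(dplyr)
library(stringr)
df %>%
  filter(str_detect(Fruits, '^(?!.*(Banana|Orange)).*Apple'))
                   Fruits
1 Apple, Apple:2, Apple:7
Here, the negative lookahead (?!.*(Banana|Orange)) asserts that neither Banana nor Orange occurs anywhere in the string together with Apple. The inner parentheses matter: without them, the alternation splits the lookahead into .*Banana or Orange, so Orange would only be rejected when it sits at the very start of the string.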
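If the list of other fruits is not known in advance, one possible alternative is to split each string on commas and keep a row only when every item is Apple, optionally followed by :<number>. This is only a sketch under those assumptions: the helper name only_apple is made up for illustration, the sample rows combine the question's data with the test rows used here, and it relies on base R only.

```r
# Sample rows: the question's data plus this answer's test rows.
df <- data.frame(
  Fruits = c("Apple:1", "Apple:4", "Banana", "Papaya",
             "Orange, Apple:2", "Apple, Apple:2, Apple:7",
             "Apple:2, Banana:10"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE when every comma-separated item in a string
# is "Apple", optionally followed by ":<number>".
only_apple <- function(x) {
  items <- strsplit(x, ",", fixed = TRUE)
  vapply(items, function(parts) {
    parts <- trimws(parts)  # drop the space left after each comma
    length(parts) > 0 &&
      all(grepl("^Apple(:\\s*[0-9]+)?$", parts, perl = TRUE))
  }, logical(1))
}

# Keep only the rows where Apple is the sole fruit.
result <- df[only_apple(df$Fruits), , drop = FALSE]
print(result)
```

With dplyr loaded, the same helper should also work inside the pipeline as df %>% filter(only_apple(Fruits)). The trade-off is that it assumes the items are comma-separated and that the optional count is a plain integer.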
Data:
df <- data.frame(
Fruits = c("Orange, Apple:2",
"Apple, Apple:2, Apple:7",
"Apple:2, Banana:10"))