I have a data.frame. I'm using dplyr and stringr, and I want to filter the column Nombre in the following way: retain all rows that contain "regimen" or "promocion" or "REGIMEN" or "PROMOCION", i.e., in both uppercase and lowercase. I tried:
str_view(df$Nombre, regex("regimen|promocion", ignore_case=T))
but in that case it only matches the first word (regimen), in both upper and lower case. If I remove ignore_case=T, it finds both "regimen" and "promocion", but the match is case sensitive, i.e., only lowercase.
Of course, this is just an example. I need to filter on many words, not just "regimen" and "promocion", which is why I don't filter on each word separately.
Since the data seems to be in Spanish, I would use a slightly more sophisticated regex, one that also matches accented letters.
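library(tidyverse)
df <- data.frame(
  N = c(100, 12345, 666, 888),
  Nombre = c("RÉGIMEN", "promoción", "ley", "otro regimen")
)
# \\w matches any single word character, so it covers both "e" and "é"
# (and "o" and "ó"); note that it would also accept any other letter there
df %>%
  filter(str_detect(Nombre, regex("r\\wgimen|promoci\\wn", ignore_case = TRUE)))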
#>       N       Nombre
#> 1   100      RÉGIMEN
#> 2 12345    promoción
#> 3   888 otro regimen
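Since the question mentions having many words to filter on, one possible variation is to build the pattern from a character vector instead of typing the alternation by hand. The sketch below is only an illustration under a few assumptions: the vector `words` is a hypothetical keyword list written without accents, and accents are removed from the column before matching with `stringi::stri_trans_general()` (stringi is installed as a dependency of stringr) rather than handled inside the regex.

```r
library(dplyr)
library(stringr)

df <- data.frame(
  N = c(100, 12345, 666, 888),
  Nombre = c("RÉGIMEN", "promoción", "ley", "otro regimen")
)

# Hypothetical keyword list, written without accents
words <- c("regimen", "promocion")

# Join the keywords into a single alternation: "regimen|promocion"
pattern <- regex(str_c(words, collapse = "|"), ignore_case = TRUE)

result <- df %>%
  # Strip accents only for the comparison, so "RÉGIMEN" is tested as "REGIMEN";
  # the Nombre column itself is left unchanged in the output
  filter(str_detect(stringi::stri_trans_general(Nombre, "Latin-ASCII"), pattern))

result
```

If any keyword could contain regex metacharacters (such as `.` or `(`), it would need to be escaped before being pasted into the pattern.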