r, regex, stringr, piping

Subset vector not containing word in piped operation in R (regex)


How do I subset a vector for elements that do not contain a word in a piped operation? (I'm really into piping)

I'm hoping there's some way to invert str_subset. In the following example, I'd like to just return the second element of x instead of the elements with hi in them:

library(stringr)
x <- c("hi", "bye", "hip")
x %>% 
    str_dup(2) %>%  # just an example operation
    str_subset("hi")  # I want to return the inverse of this

Solution

  • You can use ^(?!.*hi) to assert that the string does not contain hi. The regex anchors at the start of the string and uses a negative lookahead, (?!...), which succeeds only if .*hi cannot match from there, i.e. hi appears nowhere in the string:

    x %>% 
        str_dup(2) %>%  # just an example operation
        str_subset("^(?!.*hi)")  
    # [1] "byebye"
    

    Or filter by negating str_detect:

    x %>% 
        str_dup(2) %>%  # just an example operation
        {.[!str_detect(., "hi")]}  
    # [1] "byebye"
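
A third option that may read more naturally in a pipe: str_subset() has a negate argument, which inverts the match directly. This is only a minimal sketch, assuming stringr 1.4.0 or newer (where negate was added); the names kept and kept_base are made up for illustration. The base R line is included for comparison and relies on grep() with value = TRUE and invert = TRUE.

```r
library(stringr)  # stringr also re-exports %>% from magrittr

x <- c("hi", "bye", "hip")

# negate = TRUE keeps the elements that do NOT match the pattern
# (assumption: stringr >= 1.4.0)
kept <- x %>%
    str_dup(2) %>%  # just an example operation
    str_subset("hi", negate = TRUE)
kept
# [1] "byebye"

# Base R comparison: invert = TRUE returns the non-matching elements
kept_base <- grep("hi", str_dup(x, 2), value = TRUE, invert = TRUE)
```

On older stringr versions the negate argument is not available, so the lookahead or the !str_detect() approach above would be the fallback.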