I know it might be a silly question, but I was curious whether there is any difference. I prefer using str_detect because the syntax makes more sense in my brain.
Yes, there are substantial differences. First, contains() is a "selection helper" that must be used within a (generally tidyverse) selecting function. So you can't work with vectors or use contains() as a standalone function, ie, you can't do:
x <- c("Hello", "and", "welcome (example)")
tidyselect::contains("Hello", x)
Or you get the error:
Error:
! contains() must be used within a selecting function.
Whereas stringr::str_detect() can work with vectors and as a standalone function:
stringr::str_detect(x, "Hello")
Returns:
[1] TRUE FALSE FALSE
Secondly, stringr::str_detect() allows for regex, and tidyselect::contains() only looks for literal strings.
So for example, the below works:
library(dplyr)  # provides %>%, select() and filter()
df <- data.frame(col1 = c("Hello", "and", "welcome (example)"))
df %>%
  select(contains("1"))
# col1
# 1 Hello
# 2 and
# 3 welcome (example)
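But this does not select col1, because contains() treats the pattern as literal text and no column name contains a literal \d:
df %>% select(contains("\\d"))
(\\d is the R regex for "any digit"; here the call matches nothing and returns a data frame with no columns.)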
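To make the literal-versus-regex difference concrete, here is a minimal sketch that applies the same pattern to the column names of df. It assumes dplyr and stringr are installed; the variable names literal_cols, regex_hit and literal_hit are only illustrative.

```r
library(dplyr)
library(stringr)

df <- data.frame(col1 = c("Hello", "and", "welcome (example)"))

# contains() treats "\\d" as literal text, so no column name matches
literal_cols <- names(select(df, contains("\\d")))

# str_detect() treats "\\d" as a regex, so the "1" in "col1" matches
regex_hit <- str_detect(names(df), "\\d")

# str_detect() can be made literal as well by wrapping the pattern in fixed()
literal_hit <- str_detect(names(df), fixed("\\d"))
```

Here literal_cols should be empty, regex_hit should be TRUE, and literal_hit should be FALSE.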
Additionally, as noted by @abagail, contains() looks at column names, not at the values stored within the columns. For instance, df %>% select(contains("1")) worked above to return the column col1 (since there was a "1" in the column name). But trying to filter on the values that contain a certain pattern does not work:
df %>%
filter(contains("Hello"))
Returns the same error:
Caused by error:
! contains() must be used within a selecting function.
But you can filter on the values in the columns using stringr::str_detect():
df %>%
filter(stringr::str_detect(col1, "Hello"))
# col1
# 1 Hello
Lastly, if you are looking for similar functions outside of stringr: since tidyselect::matches() will accept regex, @GregorThomas aptly points out in the comments, "tidyselect::matches is a much closer analog to str_detect(), though still, as a selection helper, it is only for use within a selecting function."
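As a small sketch of that point, matches() can select columns by regex where contains() could not. This assumes dplyr is installed; the extra column other and the variable regex_cols are made up for the illustration.

```r
library(dplyr)

df <- data.frame(
  col1  = c("Hello", "and", "welcome (example)"),
  other = c("a", "b", "c")  # hypothetical second column with no digit in its name
)

# matches() interprets "\\d" as a regex, so only the column name with a digit is kept
regex_cols <- names(select(df, matches("\\d")))
```

regex_cols should contain only "col1". Like contains(), matches() still works on column names and only inside a selecting function.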
str_detect() is also equivalent to base R's grepl(), though the orientation of the pattern and string are reversed (ie, str_detect(string, pattern) is equivalent to grepl(pattern, string)).
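A quick way to check the reversed argument order is to run both on the same vector. This is a minimal sketch assuming stringr is installed; from_stringr and from_base are illustrative names.

```r
library(stringr)

x <- c("Hello", "and", "welcome (example)")

# stringr puts the string first, then the pattern
from_stringr <- str_detect(x, "Hello")

# base R puts the pattern first, then the string
from_base <- grepl("Hello", x)

# Both give the same logical vector for this input
same_result <- identical(from_stringr, from_base)
```

One caveat to the equivalence: for missing values, grepl() returns FALSE while str_detect() returns NA, so the two can differ on vectors that contain NA.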