Search code examples
sqlrregexdbidbplyr

Use variable with regex in string::str_detect in dbplyr SQL query


I would like to filter a SQL database based whether a regular expression appears within any column. I would like to specify the regex as a variable; however it is read as a literal string. I am having trouble getting the regex in as a variable. Thank you for your help!

Resources I've consulted:

Note: I had trouble making a reprex using the mtcars dataset, following https://www.tidyverse.org/blog/2018/01/dbplyr-1-2/. I get the error: "Error: str_detect() is not available in this SQL variant". I cannot share a reprex using my actual SQL database. As such, below is a pseudo-reprex.

library(dplyr)
library(stringr)

# Variable with regex (either lower- or uppercase "m")
my_string <- "(?i)m"

# WITHOUT SQL DATABASE ----------------------------------------------------

# This runs
mtcars %>% 
  tibble::rownames_to_column() %>% 
  filter(str_length(rowname) > 5)

# This runs with STRING
mtcars %>% 
  tibble::rownames_to_column() %>% 
  filter(str_detect(rowname, "(?i)m"))

# This runs with VARIABLE
mtcars %>% 
  tibble::rownames_to_column() %>% 
  filter(str_detect(rowname, my_string))

# WITH SQL DATABASE -------------------------------------------------------

con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
mtcars_db <- copy_to(con, tibble::rownames_to_column(mtcars), "mtcars")

# This runs
tbl(con, "mtcars") %>% 
  filter(str_length(rowname) > 5)

# This *should* run with STRING -- pretend it does ;)
tbl(con, "mtcars") %>%
  filter(str_detect(rowname, "M"))

# This does NOT run with VARIABLE
tbl(con, "mtcars") %>%
  filter(str_detect(rowname, my_string))

Solution

  • With the help of a colleague, I have a solution to force evaluation of a variable with regex in string::str_detect:

    tbl(con, "mtcars") %>%
      filter(str_detect(rowname, {{my_string}}))