Search code examples
sqlrpostgresqllistwhere-in

Query if value in list using R and PostgreSQL


I have a dataframe like this

df1

ID  value
1   c(YD11,DD22,EW23)
2   YD34
3   c(YD44,EW23)
4 

And I want to query another database to tell me how many rows had these values in them. This will eventually be done in a loop through all rows but for now I just want to know how to do it for one row.

Let's say the database looks like this:

sql_database

value  data
YD11   2222
WW20   4040
EW23   2114
YD44   3300
XH29   2040

So if I just looked at row 1, I would get:

dbGetQuery(con,
       sprintf("SELECT * FROM sql_database WHERE value IN %i", 
               df1$value[1]) %>%
  nrow()


OUTPUT:  
2

And the other rows would be :

Row 2: 0

Row 3: 2

Row 4: 0

I don't need the loop created but because my code doesn't work I would like to know how to query all rows of a table which have a value in an R list.


Solution

  • You do not need a for loop for this.

    library(tidyverse)
    library(DBI)
    library(dbplyr)
    
    
    df1 <- tibble(
      id = 1:4,
      value = list(c("YD11","DD22","EW23"), "YD34", c("YD44","EW23"), NA)
    )
    
    # creating in memory database table
    df2 <- tibble(
      value = c("YD11", "WW20", "EW23", "YD44", "XH29"),
      data = c(2222, 4040, 2114, 3300, 2040)
    )
    
    con <- dbConnect(RSQLite::SQLite(), ":memory:")
    # Add auxilary schema
    tmp <- tempfile()
    DBI::dbExecute(con, paste0("ATTACH '", tmp, "' AS some_schema"))
    copy_to(con, df2, in_schema("some_schema", "some_sql_table"), temporary = FALSE)
    
    # counting rows
    df1 %>%
      unnest(cols = c(value)) %>%
      left_join(tbl(con, dbplyr::in_schema("some_schema", "some_sql_table")) %>% collect(), by = "value") %>%
      mutate(data = if_else(is.na(data), 0, 1)) %>%
      group_by(id) %>%
      summarise(n = sum(data))