Search code examples
sqldatabaserprepared-statementrsqlite

RSQLite: binding sets and scalars in the same select query


How can I pass both a scalar and a set of values in the bind.data parameter of dbGetQuery() for an SQL statement like

select * from tst where x = ? and y in (?)

Here's what I tried:

> library("RSQLite")
> c <- dbConnect (SQLite())
> dbGetQuery(c, "create table tst (x int, y int)")
> dbGetQuery(c, "insert into tst values (?, ?)", data.frame(x=c (1,2,1,2), y=c(3, 4, 5, 6)))
> dbReadTable(c, "tst")
  x y
1 1 3
2 2 4
3 1 5
4 2 6
> dbGetQuery (c, "select * from tst where x = ? and y not in (?)", data.frame(x=2, y=I (list(7,6))))
Error in sqliteFetch(rs, n = -1, ...) : 
  RAW() can only be applied to a 'raw', not a 'double'

From reading the source, any non-data.frame bind.data parameter gets forced through as.data.frame(), so I guess there's little point in trying anything other than data frames.

NOTE: heck, it seems even binding a single set is problematic:

> dbGetQuery(c, "select * from tst where y not in (?)", c(7,6))
  x y
1 1 3
2 2 4
3 1 5
4 2 6
5 1 3
6 2 4
7 1 5

This makes it clear that 2 separate queries (one of which returns 4 and one of which returns 3 results) are send from R; SQLite never sees a set parameter.

Earlier note: I want the database engine to filter the appropriate row, i don't want R to compute the cartesian product. In the above example, simply getting rid of I() creates a 2-row dataframe (thanks to R's recycling), one of which is the solution. R sends each of these 2 rows to sqlite, and of course the second one matches. But the following shows the SQLite engine doesn't actually receive set parameters with regular data.frames:

> dbGetQuery(c, "select * from tst where x in (?) and y in (?)", data.frame(x=c(3,2), y=c(6,7)))
[1] x y
<0 rows> (or 0-length row.names)
> dbGetQuery(c, "select * from tst where x in (?) and y in (?)", data.frame(x=c(3,2), y=c(7,6)))
  x y
1 2 6

Solution

  • Why do you specify y = I(list(7,6)) instead of y=c(6,7)? This seems to work:

    dbGetQuery (c, 
                "select * from tst where x = ? and y in (?)", 
                data.frame(x=1, y=c(7,6)))
    

    You might be looking for expand.grid.

    dbGetQuery (c, 
                "select * from tst where x = ? and y in (?)", 
                expand.grid(x=c(2,3), y=c(7,6)))
    

    EDIT: Another option (and it's not pretty) is to substitute the ? in R. Something like the following:

    dbGetQuerySet <- function(con, statement, ...){
      if (length(list(...)) > 0){
        bind.data <- list(...)[[1]]
        for (set in as.data.frame(bind.data)){  
          statement <- sub('\\?', paste(set, collapse=","), statement)
        }
      }
      sqliteQuickSQL(con, statement, ...)
    }