Search code examples
rdbirodbcdbplyr

Is there a 'cleaner' way to concatenate a query string?


I am passing a query string to a connection and the result should look like this:

select game_name, month, count(*) as count
  from device_metrics.mtu_events
 where YEAR = '2019' and month between '07' and '09'
 group by game_name, month
 order by game_name asc, month asc

This works fine if I pass the above block as a single string into DBI::dbGetQuery(con, myquery)

But the dates are a variable within a shiny app, so I tried to create a function to generate the query string:

my_query <- function(start_date, end_date) {
        yr <- year(ymd(start_date))
        month_start <- month(ymd(start_date))
        month_end <- month(ymd(end_date))

        query <- paste0(
            "select game_name, month, count(*) as count
   from device_metrics.mtu_events
   where YEAR = ", yr, " and month between ", month_start, " and ", month_end, 
            " group by game_name, month
 order by game_name asc, month asc")

        return(query)
    }

When I call this function and try to query our database with it I get:

An error has been thrown from the AWS Athena client. Athena Error No: 372, HTTP Response Code: 1, Error Message: SYNTAX_ERROR: line 3:15: '=' cannot be applied to varchar, integer

Is there a 'right' way to do this? How can I construct a query string with variables and pass then to DBI::dbGetQuery()


Solution

  • Here are two options where we can quote (') the input as a string as month, year functions return numeric values

    my_query <- function(start_date, end_date) {
            yr <- year(ymd(start_date))
            month_start <- month(ymd(start_date))
            month_end <- month(ymd(end_date))
    
            query <- paste0(
                "select game_name, month, count(*) as count
       from device_metrics.mtu_events
       where YEAR = '", yr, "' and month between '", month_start, "' and '", month_end, 
                "' group by game_name, month
     order by game_name asc, month asc")
    
            return(query)
        }
    

    Using sprintf

    my_query <- function(start_date, end_date) {
                yr <- year(ymd(start_date))
                month_start <- month(ymd(start_date))
                month_end <- month(ymd(end_date))
    
                query <- sprintf("select game_name, month, count(*) as count
           from device_metrics.mtu_events
           where YEAR = '%d'  and month between '%02d' and '%02d' group by game_name, month
         order by game_name asc, month asc", yr, month_start, month_end)
    
                return(query)
            }