Search code examples
sqlrdbplyr

Is R dbplyr "smart" enough to determine the SQL database type and apply appropriate syntax?


library(tidyverse)
con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
copy_to(con, mtcars)
mtcars2 <- tbl(con, "mtcars")

As I'm starting to learn SQL my first big lesson is that the syntax will be different dependent on database type.

For example, this type of query will often work:

mtcars3 <- DBI::dbGetQuery(con, "SELECT * FROM mtcars LIMIT 5")

But on some SQL databases I try SELECT * FROM xyz LIMIT 5 and I get a syntax error. I then try something along the lines of:

DBI::dbGetQuery(con, "SELECT TOP 5 * FROM xyz")

and I'm able to get the result I want.

This makes me very curious as to what will happen when I start using dbplyr exclusively and forego using SQL queries at all (to the extent possible). Is dbplyr going to be "smart" enough to recognize the different SQL databases I'm working in? And more importantly will dbplyr apply the correct syntax, dependent on database type?


Solution

  • Yes, dbplyr is 'smart' enough to use the connect type of the table when translating dplyr commands to database (SQL) syntax. Consider the following example:

    library(dplyr)
    library(dbplyr)
    data(mtcars)
    
    # mimic postgre database
    df_postgre = tbl_lazy(mtcars, con = simulate_postgres())
    df_postgre %>% head(5) %>% show_query()
    
    # resulting SQL translation
    <SQL>
    SELECT *
    FROM `df`
    LIMIT 5
    
    # mimic MS server database
    df_server = tbl_lazy(mtcars, con = simulate_mssql())
    df_server %>% head(5) %>% show_query()
    
    # resulting SQL translation
    <SQL>
    SELECT TOP(5) *
    FROM `df`
    

    You can experiment with the different simulate_* functions in dbplyr to check translations for your particular database.