Search code examples
rdbplyr

syntax for database.table in dbplyr?


I have a connection to our database:

con <- dbConnect(odbc::odbc(), "myHive")

I know this is successful because when I run it, in the top right of RStudio I can see all of our databases and tables.

My question is, how can I select a specific database table combination? The documentation shows a user sleecting a single table, "flights" but I need to do the equivilent of somedatabase.sometable.

Tried:

mytable <- tbl(con, "somedb.sometable")
    Error in new_result(connection@ptr, statement) : 
      nanodbc/nanodbc.cpp:1344: 42S02: [Hortonworks][SQLEngine] (31740) Table or view not found: HIVE..dp_enterprise.uds_order 

Then tried:

mytable <- tbl(con, "somedb::sometable")

Error in new_result(connection@ptr, statement) : 
  nanodbc/nanodbc.cpp:1344: 42S02: [Hortonworks][SQLEngine] (31740) Table or view not found: HIVE..somedb::sometable

I tried removing the quotes "" too.

Within the connections pane of RStudio I can see somedb.sometable. It's there! How can I save it to variable mytable?


Solution

  • You select the database when creating the connection and the table when creating the tbl (with the from argument).

    There is no standard interface to dbConnect, so the exact way to pass the database name depends on the DBDriver you use. Indeed DBI::dbConnect is simply a generic dispatching to the driver-specific dbConnect.

    In your case, the driver is odbc so you can check out the documentation for odbc::dbConnect and you'll see the relevant argument is database.

    This will work:

    con <- dbConnect(odbc::odbc(), "myHive", database = "somedb")
    df <- tbl(con, from = "sometable")
    

    With most other drivers (e.g. RMariaDB, RMySQL, RPostgres, RSQLite), the argument is called dbname, so you'd do this:

    con <- dbConnect(RMariaDB::MariaDB(), dbname = "somedb")
    df <- tbl(con, from = "sometable")