Search code examples
rsql-serverodbcdatabase-schemadbplyr

tbl with in_schema returns "Invalid object name" error


After a connection to the SQL server, the databases inside it can be listed.

con = dbConnect(odbc(),
             Driver   = "ODBC Driver 17 for SQL Server",
             Server   = "xxxxxxxxxxxx",
             UID = "xxxxxxxxxxxx",
             PWD = "xxxxxxxxxxxx",
             Port     = xxxxxxxxxxxx)

Here you can find a successful connection.

enter image description here

After, I just would like to list the databases within this SQL server

databases = dbGetQuery(con, "SELECT name FROM master..sysdatabases")

enter image description here

Since I am not familiar with the SQL, It is a little bit strange for me to see that there is an already assigned Database which is "DB01CWE5462" within "con". This database can also be found within the result of dbGetQuery (DB01CWE5462). I guess that this database is automatically assigned to the con.

However, I would like to export the yellow highlighted table which is seen above. The below code was successful before (one month ago), but now it returns an error.

tbl(con, in_schema("DB01WEA84103.dbo","Ad10Min1_Average"))

Error: nanodbc/nanodbc.cpp:1655: 42000: [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Invalid object name 'DB01WEA84103.dbo.Ad10Min1_Average'. [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Statement(s) could not be prepared. 'SELECT * FROM "DB01WEA84103.dbo"."Ad10Min1_Average" AS "q13" WHERE (0 = 1)'

After a little search, I found a solution that is quite slow compared with the above codes' previous successful runs.

dbReadTable(con, 'Ad10Min1_Average', schema='DB01WEA84103.dbo')

So, what is the thing that I am missing? What should I do for the con and in_schema code which produces an error to work again?


Solution

  • The difference in speed is because tbl(con, ...) is creating an access point to a remote table, while dbReadTable(con, ...) is reading/copying the table from SQL into R.

    The approach you were using has been the standard work-around for specifying both database and schema. I would guess there has been an update to the dbplyr package that means this work-around now requires an additional step.

    Taking a close look at the SQL from the error message reveals the cause:

    SELECT * FROM "DB01WEA84103.dbo"."Ad10Min1_Average"
    

    Note the double quotes around "DB01WEA84103.dbo". The double quotes tell SQL to treat this as a single object: a schema with name DB01WEA84103.dbo, instead of two objects: a database with name DB01WEA84103 and a schema with name dbo.

    Ideally this query would read:

    SELECT * FROM "DB01WEA84103"."dbo"."Ad10Min1_Average"
    

    Now the full stop is not included in the double quotes.

    Reading the dbplyr documentation (link) for in_schema is specifies that the names of schema and table "... will be automatically quoted; use sql() to pass a raw name that won’t get quoted."

    Hence I recommend you try:

    tbl(con, in_schema(sql("DB01WEA84103.dbo"),"Ad10Min1_Average"))
    

    Notes:

    • Double quotes in SQL are used to indicate a single object, ignoring special characters. Square brackets are often used in SQL for the same purpose.
    • Whether you use single or double quotes in R does not affect whether or not the SQL code will contain double quotes. This is controlled by dbplyr's translation methods.
    • If your database name contains special characters then try enclosing them in square brackets instead: For example [my!odd@database#name].[my%unusual&schema*name].