Search code examples
roracle-databasedplyrodbcdbplyr

dbplyr Oracle alias limit database


After a lengthy troubleshooting process I finally got RStudio to connect to my works Oracle database using the instant client. The motivation to do this was to use dbplyr to do most of my querying. However, I am unable to send any queries through dbplyr as all the column and table names are very long and result in ORA-00972: identifier is too long.

Example Error

 test <- tbl(con, "long_shcema_name.very_long_and_descriptive_name") %>%
   select(a_column)
Error: nanodbc/nanodbc.cpp:1617: 42000: [Oracle][ODBC][Ora]ORA-00972: identifier is too long

<SQL> 'SELECT *
FROM ("long_shcema_name.very_long_and_descriptive_name") "zzz7"
WHERE (0 = 1)'

I thought the issue was driver related. My work uses Oracle 12.2 which has a larger cap for identifiers, but dbGetQuery works fine over instantclient

Working example without dbplyr

dbGetQuery(con,'
  select "a_column"
  from 
  long_shcema_name.very_long_and_descriptive_name')
a_column
    1
    0
    2
    .
    .
    .

This gives me the desired results.

Does a solution exist?

I'm a little irked. It took me almost three weeks to get the odbc connection set up with my works database using the instant client. I was able to query using RODBC and ROracle from the get go, but wanted to use dplyr. It ended up not working and I don't have a whole lot of juice left to figure out why. Any help would be immensely appreciated at this point.


Solution

  • The issue is most likely not long file names (unless your table names are longer than 30 characters, see here) but the use of a schema. Oracle can use double quotes to identify database, schema and table names (ref). Square brackets are sometimes used for this purpose too (ref).

    Take a look at the SQL that produces the error message:

    SELECT *
    FROM ("long_schema_name.very_long_and_descriptive_name") "zzz7"
    WHERE (0 = 1)
    

    This is most likely being interpreted as:

    • database = {default}
    • schema = {empty}
    • table = long_schema_name.very_long_and_descriptive_name

    Instead of as:

    • database = {default}
    • schema = long_schema_name
    • table = very_long_and_descriptive_name

    As @Wil comments you can address this with the in_schema function. If you try:

     test <- tbl(con, in_schema("long_schema_name","very_long_and_descriptive_name") %>%
       select(a_column)
    

    Then show_query(test) should return:

    SELECT *
    FROM ("long_schema_name"."very_long_and_descriptive_name") "zzz7"
    WHERE (0 = 1)
    

    Note how this differs from the first query by the addition of double quotes either side of the full stop.