Search code examples
rdplyrdbplyr

Error in UseMethod("escape") : no applicable method for 'escape' applied to an object of class


I'm getting the following error message when I try some code on a distant Postgres database.

The following peusdo-reproducing code works well when dataframes are local but not when they are distant.

library(tidyverse)
library(dbplyr)
library(RPostgres)


event <- tibble(id = c("00_1", "00_2", "00_3", "00_4", "00_5", "00_6", "00_7"),
               type_id = c("A", "B", "C", "B", "A", "B", "C"))


detail <- tibble(id = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L),
                event_id = c("00_1", "00_1", "00_2", "00_2", "00_3", "00_4", "00_4", "00_5", "00_6", "00_6", "00_7", "00_8"),
                type_id = c(3L, 4L, 6L, 7L, 2L, 6L, 3L, 2L, 6L, 5L, 2L, 1L))


event_f <- event %>%
 mutate(new_variable = id %in% (detail %>%
          filter(type_id == 6) %>%
          pull(event_id))) %>%
 collect()

Error in UseMethod("escape") : no applicable method for 'escape' applied to an object of class "c('tbl_PqConnection', 'tbl_dbi', 'tbl_sql', 'tbl_lazy', 'tbl')"


Solution

  • The problem is most likely caused by the nested dplyr query. There are two possibilities here:

    1. dbplyr can not translate your query into SQL,

    2. dbplyr translates your query but it is not valid SQL.

    How to tell

    dbplyr attempts to translate each set of commands into SQL. One way to verify this is to use the function show_query().

    For example the R command:

    my_table %>% mutate(new_col = 2 * old_col + id) %>% select(new_col, id) %>% show_query()
    

    Will return something like the following SQL command:

    SELECT 2 * old_col + id AS new_col, id
    FROM database.my_table
    

    This will only happen if translation of R to SQL has been possible. Therefore:

    1. If show_query returns SQL, you should review the SQL to determine where it is wrong and adapt your R commands to correct this

    2. If show_query does not return, or gives an error then dbplyr has not be able to translate your query, and it needs to be restructured.

    Anticipating that the problem is caused by the nested dplyr commands (detail %>% filter %>% pull), I recommend replacing this by a semi_join as follows:

    detail_f <- detail %>%
        filter(type_id == 6)
    
    event_f <- event %>%
        semi_join(detail_f, by = c("id" = "type_id")) %>%
        collect()
    

    If you are unfamiliar with semi-joins you might find this post helpful. R also supports the use of anti-joins.

    Edit: misread your initial query.

    As you want to add an indicator for the presence/absence of event_id in your output table, you can probably avoid semi- or anti-joins. Perhaps something like the following:

    detail_f <- detail %>%
        filter(type_id == 6) %>%
        select(id_to_compare = event_id) %>%
        mutate(new_variable = 1)
    
    event_f <- event %>%
        left_join(detail_f, by = c("id" = "id_to_compare")) %>%
        mutate(new_variable = ifelse(is.na(new_variable), 0, new_variable) %>%
        collect()
    

    Note, I have used 0 & 1 here instead of FALSE & TRUE as some versions of SQL don't handle these as easily as R does.