I'm getting the following error message when I try some code on a distant Postgres
database.
The following peusdo-reproducing code works well when dataframes are local but not when they are distant.
library(tidyverse)
library(dbplyr)
library(RPostgres)
event <- tibble(id = c("00_1", "00_2", "00_3", "00_4", "00_5", "00_6", "00_7"),
type_id = c("A", "B", "C", "B", "A", "B", "C"))
detail <- tibble(id = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L),
event_id = c("00_1", "00_1", "00_2", "00_2", "00_3", "00_4", "00_4", "00_5", "00_6", "00_6", "00_7", "00_8"),
type_id = c(3L, 4L, 6L, 7L, 2L, 6L, 3L, 2L, 6L, 5L, 2L, 1L))
event_f <- event %>%
mutate(new_variable = id %in% (detail %>%
filter(type_id == 6) %>%
pull(event_id))) %>%
collect()
Error in UseMethod("escape") : no applicable method for 'escape' applied to an object of class "c('tbl_PqConnection', 'tbl_dbi', 'tbl_sql', 'tbl_lazy', 'tbl')"
The problem is most likely caused by the nested dplyr query. There are two possibilities here:
dbplyr can not translate your query into SQL,
dbplyr translates your query but it is not valid SQL.
dbplyr attempts to translate each set of commands into SQL. One way to verify this is to use the function show_query()
.
For example the R command:
my_table %>% mutate(new_col = 2 * old_col + id) %>% select(new_col, id) %>% show_query()
Will return something like the following SQL command:
SELECT 2 * old_col + id AS new_col, id
FROM database.my_table
This will only happen if translation of R to SQL has been possible. Therefore:
If show_query
returns SQL, you should review the SQL to determine where it is wrong and adapt your R commands to correct this
If show_query
does not return, or gives an error then dbplyr has not be able to translate your query, and it needs to be restructured.
Anticipating that the problem is caused by the nested dplyr commands (detail %>% filter %>% pull)
, I recommend replacing this by a semi_join as follows:
detail_f <- detail %>%
filter(type_id == 6)
event_f <- event %>%
semi_join(detail_f, by = c("id" = "type_id")) %>%
collect()
If you are unfamiliar with semi-joins you might find this post helpful. R also supports the use of anti-joins.
Edit: misread your initial query.
As you want to add an indicator for the presence/absence of event_id
in your output table, you can probably avoid semi- or anti-joins. Perhaps something like the following:
detail_f <- detail %>%
filter(type_id == 6) %>%
select(id_to_compare = event_id) %>%
mutate(new_variable = 1)
event_f <- event %>%
left_join(detail_f, by = c("id" = "id_to_compare")) %>%
mutate(new_variable = ifelse(is.na(new_variable), 0, new_variable) %>%
collect()
Note, I have used 0 & 1 here instead of FALSE
& TRUE
as some versions of SQL don't handle these as easily as R does.