Search code examples
sqlrdbplyr

Problem using dbplyr to create a SQL query


I am trying to create a query that cleans a data in SQL but I don't know SQL. To do so, I create a sample of the dataset and use dplyr to clean and create the final data that I want and then I want to use dbplyr to create the query. But I am having trouble to create the query

data <- read_csv("DataBase/raw/bquxjob_1016f9df_180d80c269e.csv")

#Creating query I will run in SQL
data %>% 
  filter(codigo_natureza_juridica == 2046 
         | codigo_natureza_juridica  ==  2054 
         | codigo_natureza_juridica == 2062
         | codigo_natureza_juridica == 2070
         | codigo_natureza_juridica == 2089
         | codigo_natureza_juridica == 2097 
         | codigo_natureza_juridica == 2135
         | codigo_natureza_juridica == 2232
         | codigo_natureza_juridica == 2240
         | codigo_natureza_juridica == 2259
         | codigo_natureza_juridica == 2267 &
           data_inicio_atividade <= '2011-12-31') %>%
  mutate(year_open             = format(data_inicio_atividade, format ="%Y"),
         month_open            = format(data_inicio_atividade, format ="%m"),
         date_open             = format(data_inicio_atividade, format ="%Y%m"),
         year_close            = ifelse(situacao_cadastral != 2, 
                                        format(data_situacao_cadastral, format ="%Y"),
                                        NA),
         month_close           = ifelse(situacao_cadastral != 2, 
                                        format(data_situacao_cadastral, format ="%m"),
                                        NA),
         date_close           = ifelse(situacao_cadastral != 2, 
                                        format(data_situacao_cadastral, format ="%Y%m"),
                                        NA),
         open_2007              = ifelse(year_open <= 2007, 1,0)) %>%
  show_query()


This code gives the following error:

Error in UseMethod("show_query") : 
  no applicable method for 'show_query' applied to an object of class "c('tbl_df', 'tbl', 'data.frame')"

Solution

  • show_query() will only work on a database, and you are trying to use it on a dataframe. To send your data from the csv to a temporary database object to create the query, you could use tbl_memdb() and instead do:

    data %>% 
      tbl_memdb() %>%
      filter(...) %>%
      mutate(...) %>%
      show_query()