Search code examples
roracle-databasedplyrdbplyr

Mutate new character col based on another character col in oracle table using R dbplyr


I have a oracle table with a col COMPLAINT_REASON

complaints_tbl %>% head() %>% select(COMPLAINT_REASON)

# Source:   lazy query [?? x 1]
# Database: Oracle 12.01.0020[user@user_db/]
  COMPLAINT_REASON         
  <chr>                    
1 Payment Related          
2 Bill Related          
3 Order Management          
4 Repair/Connection related
5 Broadband
6 Product fault   

I am trying to create a new col called primary_reason but with different values i.e. if the COMPLAINT_REASON = Payment Related then the primary_reason should have Payments. If none are matched then have the value as is in the primary_reason column.

In normal circumstance, I would do something like this using data.table:

complaints_tbl <- complaints_tbl[,primary_reason := forcats::fct_recode(COMPLAINT_REASON,
    "Payments"    = "Payment Related",
    "Billing"    = "Bill Related",
    "Orders"    = "Order Management",
    "Billing"    = "Billing/Payment Enquiry")]

As you can see the ones that are not available will come under primary reason as is. (Product fault, Broadband, Repair/Connection related) and Payment Related becomes Payments etc. in the primary_reason.

I have tried:

complaints_tbl %>% mutate(primary_reason = forcats::fct_recode(COMPLAINT_REASON
    "Payments"    = "Payment Related",
    "Billing"    = "Bill Related",
    "Orders"    = "Order Management",
    "Billing"    = "Billing/Payment Enquiry"))

But getting the error:

Error in check_factor(.f) : object 'COMPLAINT_REASON' not found

Lastly, it will be ideal to push the new col back to my existing table in oracle for future use.

Any pointers? Cheers


Solution

  • I work in SQL server not oracle, but I think the challenge here is ensuring dbplyr can translate your commands to the database language regardless of your choice of language.

    In general dbplyr struggles to translate commands outside the dplyr or tidyverse collection. Hence why forcats::fct_recode is not working for you.

    An example solution using ifelse, which gets translated correctly in my environment:

    complaints_tbl %>%
      # create column for ease of changes
      mutate(primary_reason = NA) %>%
      # one mutate per match/rename
      mutate(primary_reason = ifelse(COMPLAINT_REASON = "Payment Related",
                                     yes = "Payments", no = primary_reason)) %>%
      mutate(primary_reason = ifelse(COMPLALINT_REASON = "Billing Related",
                                     yes = "Billing", no = primary_reason)) %>%
      # if none are matched
      mutate(primary_reason = ifelse(is.na(primary_reason),
                                     yes = COMPLAINT_REASON, no = primary_reason))
    

    Instead of multiple mutates with ifelse, you could instead use case_when.