Tags: r, dplyr, survey, quosure

Unquote quosure outside quasiquotation context


I am defining a function to get the predicted values of a regression model with survey data for different subgroups (subpopulations). I use the svyglm function from the survey package.

My problem concerns the subset argument of the svyglm function. It uses non-standard evaluation, which I understand to mean that it does not accept the column name as a string. I tried passing the bare column name, and I tried quoting it with enquo() and unquoting it with !!. Neither approach works. I also experimented with ensym() and expr(), but without success.

Data & library

library(dplyr)
library(survey)
library(srvyr)
library(purrr)
library(rlang)

mtcars <- read.table("https://forge.scilab.org/index.php/p/rdataset/source/file/master/csv/datasets/mtcars.csv",
                     sep=",", header=TRUE)

mtcars_cplx <- mtcars %>% as_survey_design(id = cyl, weights = qsec)

carb <- c(1:8)
cyl <- c(4:8)
new_data <- expand.grid(carb, cyl)
colnames(new_data) <- c("carb", "cyl")

With quosure

Function and Input

subpop_pred <- function(formula, data, subpop, new_data) {
  
  subpop_quo <- enquo(subpop)
  subpop_txt <- data$variables %>% select(!!subpop_quo) %>% colnames()
  
  for(i in min(data$variables[subpop_txt]):max(data$variables[subpop_txt])){
    reg <- svyglm(formula, data, subset=!!subpop_quo==i)
    pred <- predict(reg, newdata=new_data)
    
    if(exists("reg_end")==TRUE){
      pred <- cbind(new_data, pred, confint(pred))
      pred[subpop_txt] <- i
      reg_end <- rbind(reg_end, pred)
    } else {
      reg_end <- cbind(new_data, pred, confint(pred))
      reg_end[subpop_txt] <- i
    }
  }
}

subpop_pred(mpg ~ carb + cyl + carb*cyl, 
            data=mtcars_cplx, 
            new_data=new_data,
            subpop=gear)

Output/Error

 Error: Base operators are not defined for quosures.
Do you need to unquote the quosure?

  # Bad:
  myquosure == rhs

  # Good:
  !!myquosure == rhs
Call `rlang::last_error()` to see a backtrace 
8. stop(cnd) 
7. abort(paste_line("Base operators are not defined for quosures.", 
    "Do you need to unquote the quosure?", "", "  # Bad:", bad, 
    "", "  # Good:", good, )) 
6. Ops.quosure(subpop_quo, i) 
5. eval(subset, model.frame(design), parent.frame()) 
4. eval(subset, model.frame(design), parent.frame()) 
3. svyglm.survey.design(formula, data, subset = !!subpop_quo == 
    i) 
2. svyglm(formula, data, subset = !!subpop_quo == i) 
1. subpop_pred(mpg ~ carb + cyl + carb * cyl, data = mtcars_cplx, 
    new_data = new_data, subpop = gear) 

Without quosure

Function and Input

subpop_pred <- function(formula, data, subpop, new_data) {
  
  subpop_quo <- enquo(subpop)
  subpop_txt <- data$variables %>% select(!!subpop_quo) %>% colnames()
  
  for(i in min(data$variables[subpop_txt]):max(data$variables[subpop_txt])){
    reg <- svyglm(formula, data, subset=subpop==i)
    pred <- predict(reg, newdata=new_data)
    
    if(exists("reg_end")==TRUE){
      pred <- cbind(new_data, pred, confint(pred))
      pred[subpop_txt] <- i
      reg_end <- rbind(reg_end, pred)
    } else {
      reg_end <- cbind(new_data, pred, confint(pred))
      reg_end[subpop_txt] <- i
    }
  }
}

subpop_pred(mpg ~ carb + cyl + carb*cyl, data=mtcars_cplx, new_data=new_data, subpop=gear)

Output

Error in eval(subset, model.frame(design), parent.frame()) : 
  object 'gear' not found 
5. eval(subset, model.frame(design), parent.frame()) 
4. eval(subset, model.frame(design), parent.frame()) 
3. svyglm.survey.design(formula, data, subset = subpop == i) 
2. svyglm(formula, data, subset = subpop == i) 
1. subpop_pred(mpg ~ carb + cyl + carb * cyl, data = mtcars_cplx, 
    new_data = new_data, subpop = gear) 

Do you have an idea how to get the function to work?


Solution

  • I could get things working with the subset argument by combining expr() and rlang::eval_tidy().

    The model line in your function could then read:

    reg <- svyglm(formula, data = data, 
           subset = rlang::eval_tidy( expr( !!subpop_quo == i), data =  data) )
    

    I don't know how robust this is, though, or whether there is a more straightforward tidyeval approach. Working on this made me realize that the subset() function and the subset argument are difficult to work with inside functions. :-P
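To illustrate the mechanism in isolation, here is a minimal sketch on a plain data frame rather than a survey design. It assumes the built-in datasets::mtcars, and the helper name subset_rows() is hypothetical (it is not part of rlang or survey). It shows that !! fails when the expression is evaluated by base R directly, but works once the expression is built with expr() and evaluated with eval_tidy().

```r
library(rlang)

# Outside a quasiquotation context, `!!q == 4` is ordinary double negation
# applied to `q == 4`, so base R tries to compare the quosure itself and
# rlang signals an error.
q <- quo(gear)
failed <- tryCatch({ !!q == 4; FALSE }, error = function(e) TRUE)

# Hypothetical helper: returns a logical vector marking the rows of `df`
# where the column given as `group` equals `value`.
subset_rows <- function(df, group, value) {
  group_quo <- enquo(group)
  # expr() is a quasiquotation context, so `!!` is interpreted by rlang here
  cond <- expr(!!group_quo == !!value)
  # eval_tidy() evaluates the expression with the columns of `df` in scope
  eval_tidy(cond, data = df)
}

keep <- subset_rows(datasets::mtcars, gear, 4)
sum(keep)  # number of rows with gear == 4
```

The resulting logical vector is already evaluated, so a function that receives it no longer has to look up the column name itself. Whether this carries over unchanged to every survey design object is something I have not verified.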