Tags: r, dplyr, tidyeval, non-standard-evaluation

mutate_ deprecated, easy and understandable alternatives?


I'm trying to create a function that creates a variable. Like this:

Add_Extreme_Variable <- function(dataframe, variable, variable_name){
    dataframe %>%
    group_by(cod_station, year_station) %>%
    mutate(variable_name= ifelse(variable > quantile(variable, 0.95, na.rm=TRUE),1,0)) %>%
    ungroup() %>%
    return()
}

df <- Add_Extreme_Variable (df, rain, extreme_rain)

df is the dataframe I'm working with, rain is a numeric variable in df, and extreme_rain is the name of the variable I want to create.

If I use mutate_(), everything works, but the problem is that it's deprecated. However, the solutions I have found on Stack Overflow (1, 2, 3) and in the vignette don't seem to fit my problem, or they seem far more complicated than I need. I cannot find good examples of how to work with quo(), !! without a space, !! with a space, or how to replace = with :=, and I don't know whether they would solve my problem or are even necessary, since the ultimate goal of this function is to make the code cleaner. Any suggestions?


Solution

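  • You can use {{ }} (curly-curly), available since rlang 0.4.0. Wrapping a function argument in {{ }} passes the column the caller named through to the dplyr verb, and when the new column's name is itself an argument you replace = with := inside mutate(). See the Tidy evaluation section in Hadley Wickham's Advanced R book. Below is an example using the gapminder dataset.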

    library(gapminder)
    library(rlang)
    library(tidyverse)
    
    Add_Extreme_Variable2 <- function(dataframe, group_by_var1, group_by_var2, variable, variable_name) {
      res <- dataframe %>%
        # {{ }} forwards the unquoted column names supplied by the caller
        group_by({{group_by_var1}}, {{group_by_var2}}) %>%
        # := is required because the left-hand side is not a literal name
        mutate({{variable_name}} := ifelse({{variable}} > quantile({{variable}}, 0.95, na.rm = TRUE), 1, 0)) %>%
        ungroup()
      return(res)
    }
    
    df <- Add_Extreme_Variable2(gapminder, continent, year, pop, pop_extreme) %>% 
      arrange(desc(pop_extreme))
    df
    #> # A tibble: 1,704 x 7
    #>    country   continent  year lifeExp      pop gdpPercap pop_extreme
    #>    <fct>     <fct>     <int>   <dbl>    <int>     <dbl>       <dbl>
    #>  1 Australia Oceania    1952    69.1  8691212    10040.           1
    #>  2 Australia Oceania    1957    70.3  9712569    10950.           1
    #>  3 Australia Oceania    1962    70.9 10794968    12217.           1
    #>  4 Australia Oceania    1967    71.1 11872264    14526.           1
    #>  5 Australia Oceania    1972    71.9 13177000    16789.           1
    #>  6 Australia Oceania    1977    73.5 14074100    18334.           1
    #>  7 Australia Oceania    1982    74.7 15184200    19477.           1
    #>  8 Australia Oceania    1987    76.3 16257249    21889.           1
    #>  9 Australia Oceania    1992    77.6 17481977    23425.           1
    #> 10 Australia Oceania    1997    78.8 18565243    26998.           1
    #> # ... with 1,694 more rows
    
    summary(df)
    #>         country        continent        year         lifeExp     
    #>  Afghanistan:  12   Africa  :624   Min.   :1952   Min.   :23.60  
    #>  Albania    :  12   Americas:300   1st Qu.:1966   1st Qu.:48.20  
    #>  Algeria    :  12   Asia    :396   Median :1980   Median :60.71  
    #>  Angola     :  12   Europe  :360   Mean   :1980   Mean   :59.47  
    #>  Argentina  :  12   Oceania : 24   3rd Qu.:1993   3rd Qu.:70.85  
    #>  Australia  :  12                  Max.   :2007   Max.   :82.60  
    #>  (Other)    :1632                                                
    #>       pop              gdpPercap         pop_extreme     
    #>  Min.   :6.001e+04   Min.   :   241.2   Min.   :0.00000  
    #>  1st Qu.:2.794e+06   1st Qu.:  1202.1   1st Qu.:0.00000  
    #>  Median :7.024e+06   Median :  3531.8   Median :0.00000  
    #>  Mean   :2.960e+07   Mean   :  7215.3   Mean   :0.07042  
    #>  3rd Qu.:1.959e+07   3rd Qu.:  9325.5   3rd Qu.:0.00000  
    #>  Max.   :1.319e+09   Max.   :113523.1   Max.   :1.00000  
    #> 
    

    Created on 2019-11-10 by the reprex package (v0.3.0)
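
  • For comparison, the quo()/!!/:= style mentioned in the question can be sketched roughly as below. This is only an illustration, not part of the original answer: the function name add_extreme_variable_quo, the toy data frame and its columns are hypothetical, and passing the grouping columns through ... is a design choice made here for brevity. It assumes dplyr and rlang are installed.

```r
library(dplyr)
library(rlang)

# Hypothetical helper: same logic as Add_Extreme_Variable2, written with
# enquo()/ensym() and !! instead of {{ }}.
add_extreme_variable_quo <- function(dataframe, variable, variable_name, ...) {
  variable <- enquo(variable)            # capture the column expression
  variable_name <- ensym(variable_name)  # capture the new column's name as a symbol
  dataframe %>%
    group_by(...) %>%                    # grouping columns are forwarded as-is
    mutate(
      # !! unquotes the captured pieces; := allows an unquoted left-hand side
      !!variable_name := ifelse(
        (!!variable) > quantile(!!variable, 0.95, na.rm = TRUE), 1, 0
      )
    ) %>%
    ungroup()
}

# Toy data (made up for this sketch): two stations, three readings each
toy <- data.frame(
  station = c("a", "a", "a", "b", "b", "b"),
  rain    = c(1, 2, 10, 5, 50, 6)
)

out <- add_extreme_variable_quo(toy, rain, extreme_rain, station)
out$extreme_rain
#> [1] 0 0 1 0 1 0
```

    Here {{ variable }} in the answer above is shorthand for the enquo() followed by !! steps, which is why the curly-curly version is usually the shorter one to read.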