How to refer to a tibble column, using a variable name, in a pipe (R)


I am pretty new to R, so this question may be a bit naive.

I have a tibble with several columns, and I want to create a factor (Bin) by binning the values of one of the columns into N bins. This is done in a pipe. However, I would like to define the column to be binned at the top of the script (e.g. bin2use = RT), so that the choice stays flexible.

I've tried several ways of referring to a column by the name stored in this variable, but I cannot get it to work. Among others, I have tried get(), eval() and [[]].

Simplified example code:

library(dplyr)    # tibble(), group_by(), mutate(), %>%
library(ggplot2)  # cut_number()

Subject <- c(rep(1, 100), rep(2, 100))
RT <- runif(200, 300, 800)
data_st <- tibble(Subject, RT)

bin2use = 'RT'
nbin = 5

binned_data <- data_st %>%
  group_by(Subject) %>%
  mutate(
    Bin = cut_number(get(bin2use), nbin, label = F)
  )

Error in mutate_impl(.data, dots) : 
  non-numeric argument to binary operator

Solution

  • We can use non-standard evaluation with `lazyeval`:

    library(dplyr)
    library(ggplot2)
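    # Build the cut_number() call with the column name and bin count filled in,
    # then pass it to mutate_() (deprecated since dplyr 0.7.0).
    f1 <- function(colName, bin){
      call <- lazyeval::interp(~cut_number(a, b, label = FALSE),
                               a = as.name(colName), b = bin)
      data_st %>%
        group_by(Subject) %>%
        mutate_(.dots = setNames(list(call), "Bin"))
    }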
    
    f1(bin2use, nbin)
    #Source: local data frame [200 x 3]
    #Groups: Subject [2]
    
    #   Subject       RT   Bin
    #     <dbl>    <dbl> <int>
    #1        1 752.2066     5
    #2        1 353.0410     1
    #3        1 676.5617     4
    #4        1 493.0052     2
    #5        1 532.2157     3
    #6        1 467.5940     2
    #7        1 791.6643     5
    #8        1 333.1583     1
    #9        1 342.5786     1
    #10       1 637.8601     4
    # ... with 190 more rows
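
With a more recent dplyr, the same result can probably be had without `lazyeval`, by indexing the `.data` pronoun with the string held in `bin2use`. The following is only a minimal sketch, assuming a dplyr version that supports tidy evaluation (0.7.0 or later) and reusing the names from the question. The `set.seed()` call and the trailing `ungroup()` are additions for illustration and are not part of the original code.

```r
library(dplyr)    # tibble(), group_by(), mutate(), %>% and the .data pronoun
library(ggplot2)  # cut_number()

set.seed(1)  # arbitrary seed, only so the sketch is reproducible
data_st <- tibble(
  Subject = c(rep(1, 100), rep(2, 100)),
  RT      = runif(200, 300, 800)
)

bin2use <- "RT"  # name of the column to bin, stored as a string
nbin    <- 5     # number of bins per subject

binned_data <- data_st %>%
  group_by(Subject) %>%
  # .data[[bin2use]] looks up the column whose name is held in bin2use
  mutate(Bin = cut_number(.data[[bin2use]], nbin, labels = FALSE)) %>%
  ungroup()  # added here so later steps are not grouped by accident
```

Changing `bin2use` to the name of another numeric column should switch the binned column without touching the pipe itself.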