dplyr: select all variables except for those contained in vector


This should be a simple issue but I am struggling.

I have a vector of variable names that I want to exclude from a data frame:

df <- data.frame(matrix(rexp(50), nrow = 10, ncol = 5))
names(df) <- paste0(rep("variable_", 5), 1:5)

excluded_vars <- c("variable_1", "variable_3")

I would have thought that just excluding the object in the select statement with - would have worked:

select(df, -excluded_vars)

But I get the following error:

Error in -excluded_vars : invalid argument to unary operator

The same is true when using select_().

Any ideas?


Solution

  • select(df, -any_of(excluded_vars)) is now the safest way to do this: the code will not break if excluded_vars contains a variable name that doesn't exist in df.
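
For illustration, here is a minimal sketch of that approach using the data frame from the question. It assumes a dplyr version that provides any_of() and all_of() (1.0.0 or later). The set.seed() call and the extra name "variable_9" are additions made for this example only: the seed keeps the output reproducible, and "variable_9" is a hypothetical column that deliberately does not exist in df.

```r
library(dplyr)

# Same data as in the question: five columns named variable_1 .. variable_5
set.seed(1)  # assumption: added only so the example is reproducible
df <- data.frame(matrix(rexp(50), nrow = 10, ncol = 5))
names(df) <- paste0("variable_", 1:5)

# "variable_9" is a hypothetical name that is not a column of df
excluded_vars <- c("variable_1", "variable_3", "variable_9")

# any_of() drops the names that exist and silently ignores the rest
kept <- select(df, -any_of(excluded_vars))
names(kept)
# "variable_2" "variable_4" "variable_5"

# all_of() is the strict counterpart: it raises an error on missing names,
# which is caught here so the script keeps running
strict <- tryCatch(
  select(df, -all_of(excluded_vars)),
  error = function(e) "error: missing column"
)
strict
# "error: missing column"
```

Which helper to use depends on what you want to happen with a stale name: any_of() suits exclusion lists that may go out of date, while all_of() is the better fit when a missing column should be treated as a mistake.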