Is it possible to select variables in a data frame by matching certain factor levels, i.e. to select columns based on the factor levels they have (used or unused)? I can summarise by levels or subset rows, but I wondered whether columns could be selected from the data frame, or at least listed, according to the factor levels they contain.
library(dplyr)
height <- c(132,151,162,139,166,147,122)
weight <- c(48,49,66,53,67,52,40)
gender <- c("male","male","female","female","male","female","male")
gender2 <- c("female","male","male","male","male","female","male")
genderx <- c("xfemale","malex","malex","male","male","xfemale","xfemale")
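# stringsAsFactors = TRUE makes the gender columns factors (the default is FALSE from R 4.0.0)
df <- data.frame(height, weight, gender, gender2, genderx, stringsAsFactors = TRUE) %>%
  tibble::rowid_to_column("ID")  # rowid_to_column() comes from tibble, not dplyr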
Something like the following (or something quite different):
df %>% select(vars(levels == c("male", "female")))
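We can use select_if with a predicate that checks whether a column is a factor and whether it has all of the given levels: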
library(dplyr)
df %>%
  select_if(~ is.factor(.) && all(c("male", "female") %in% levels(.)))
Or use any instead of all to keep the factor columns that have at least one of the levels:
df %>%
  select_if(~ is.factor(.) && any(c("male", "female") %in% levels(.)))
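Since select_if is superseded as of dplyr 1.0.0, the same check can probably be written with select(where(...)), and the matching column names can be listed with base R. The following is only a sketch under a few assumptions: the data is a cut-down version of the question's, the columns are built with factor() so they are factors regardless of R version, and has_levels and wanted are hypothetical names introduced for illustration.

```r
library(dplyr)

# Cut-down example data; factor() is called explicitly so the columns are
# factors whatever the stringsAsFactors default is.
df <- data.frame(
  height  = c(132, 151, 162, 139, 166, 147, 122),
  gender  = factor(c("male", "male", "female", "female", "male", "female", "male")),
  genderx = factor(c("xfemale", "malex", "malex", "male", "male", "xfemale", "xfemale"))
)

# Hypothetical helper: TRUE when x is a factor that has every wanted level
has_levels <- function(x, wanted) is.factor(x) && all(wanted %in% levels(x))

wanted <- c("male", "female")

# Keep only the matching columns (where() needs dplyr >= 1.0.0)
selected <- df %>% select(where(~ has_levels(.x, wanted)))

# Or just list the names of the matching columns, using base R only
matching <- names(df)[vapply(df, has_levels, logical(1), wanted = wanted)]
```

Here both selected and matching refer to gender only, because genderx has "male" but not "female" among its levels. Note that levels() also reports unused levels; to match only the levels that actually occur, one option is to test against levels(droplevels(x)) inside the helper.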