I'm trying to find a quick way to convert a binary numerical variable into a factor using dplyr.
I have a dataset with this structure:
library(dplyr)
f <- as_tibble(data.frame(col1 = c(1, 1, 0), col2 = c("ham", "spam", "spam"), col3 = c(1, 2, 8), col4 = c(1, 0, 0)))
So far, I have tried using n_distinct to count the distinct values in each numeric column:
g<-f %>% select_if(is.numeric) %>% sapply(n_distinct)
But I don't know how to proceed from there and keep only those columns with n_distinct == 2. To be clear, my final output should be:
names(g[g==2])
[1] "col1" "col4"
Any idea? Thank you
How about using select_if with a predicate function that checks both that the column is numeric and that the number of distinct values is exactly 2? Try:
f %>%
  select_if(~n_distinct(.) == 2 & is.numeric(.)) %>%
  names()
Which gives you:
[1] "col1" "col4"
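Since the stated goal is to convert those binary columns into factors, the same predicate can be reused inside mutate. The following is only a minimal sketch, assuming dplyr 1.0.0 or later (for across() and where()); the helper name is_binary_numeric and the result name f_factored are made up for illustration and are not part of the original answer.

```r
library(dplyr)

# Same example data as in the question, built directly as a tibble.
f <- tibble(
  col1 = c(1, 1, 0),
  col2 = c("ham", "spam", "spam"),
  col3 = c(1, 2, 8),
  col4 = c(1, 0, 0)
)

# Hypothetical helper: TRUE for a numeric column with exactly two
# distinct values. is.numeric() is checked first so that n_distinct()
# is only evaluated for numeric columns.
is_binary_numeric <- function(x) is.numeric(x) && n_distinct(x) == 2

# Convert only the matching columns to factors; all others are unchanged.
f_factored <- f %>%
  mutate(across(where(is_binary_numeric), as.factor))

# Names of the columns that are now factors.
names(select(f_factored, where(is.factor)))
```

With the example data, the last line should list col1 and col4, while col2 stays character and col3 stays numeric. Note that n_distinct counts NA as a distinct value by default, so a 0/1 column containing missing values would not match unless the helper uses n_distinct(x, na.rm = TRUE).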