
filter on named vector in dplyr (R)


I'm trying to find a quick way to convert binary numeric variables into factors using dplyr. As a first step, I need the names of the numeric columns that have exactly two distinct values.

I have a dataset with this structure:

library(dplyr)
f <- as_tibble(data.frame(col1 = c(1, 1, 0), col2 = c("ham", "spam", "spam"), col3 = c(1, 2, 8), col4 = c(1, 0, 0)))

So far, I have tried using n_distinct:

g <- f %>% select_if(is.numeric) %>% sapply(n_distinct)

But I don't know how to stay within dplyr and keep only those columns where n_distinct == 2. To be clear, my final output should be:

names(g[g==2])

[1] "col1" "col4"

Any ideas? Thank you.


Solution

  • How about using select_if with a function that checks both that the column is numeric and that its number of distinct values is exactly 2? Try:

    f %>% 
      select_if(~n_distinct(.) == 2 & is.numeric(.)) %>% 
      names()
    

    Which gives you:

    [1] "col1" "col4"
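
Since the stated goal is to turn those columns into factors, here is a minimal sketch of one way to do both steps, assuming dplyr 1.0.0 or later, where select_if is superseded by select(where(...)) and across() is available. The helper name is_binary_numeric is made up for illustration and is not part of dplyr.

```r
library(dplyr)

# Same data as in the question
f <- tibble(
  col1 = c(1, 1, 0),
  col2 = c("ham", "spam", "spam"),
  col3 = c(1, 2, 8),
  col4 = c(1, 0, 0)
)

# Hypothetical helper: TRUE for numeric columns with exactly two distinct values.
# && short-circuits, so n_distinct() is only evaluated for numeric columns.
is_binary_numeric <- function(x) is.numeric(x) && n_distinct(x) == 2

# Names of the matching columns
binary_cols <- f %>%
  select(where(is_binary_numeric)) %>%
  names()

# Convert only the matching columns to factors, leaving the others untouched
f_factored <- f %>%
  mutate(across(where(is_binary_numeric), as.factor))
```

Here binary_cols holds "col1" and "col4", and in f_factored those two columns are factors while col2 and col3 keep their original types. Note that a numeric column which happens to contain only two distinct values by chance would also be converted, so it may be worth checking the result on real data.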