Below I create a function that deletes a specific column if there is only one unique value in it. Can I somehow use lapply within %>% to avoid calling the function three times? Or even call the function for all columns?
df <- tibble(col1 = sample(1:6), col2 = sample(1:6), col3 = 3, col4 = 4)
condDelCol <- function(mycolumn, mydataframe) {
if(length(unique(mydataframe[[mycolumn]])) == 1) { mydataframe[[mycolumn]] = NULL }
mydataframe
}
df %>%
condDelCol("col2", .) %>%
condDelCol("col3", .) %>%
condDelCol("col4", .)
With dplyr
, an option is select_if
library(dplyr)
df %>%
select_if(~ n_distinct(.) > 1)
# A tibble: 6 x 2
# col1 col2
# <int> <int>
#1 1 6
#2 6 1
#3 5 5
#4 3 4
#5 4 2
#6 2 3
Or another way is base R
by looping over the columns with sapply
, create a logical vector
, extract the column names that have only single unique
value and assign (<-
) it to NULL
i1 <- sapply(df, function(x) length(unique(x)))
df[names(which(i1 == 1))] <- NULL
Or with Filter
Filter(var, df)