
Use apply functions within %>%


Below I create a function that deletes a specific column if there is only one unique value in it. Can I somehow use lapply within %>% to avoid calling the function three times? Or even call the function for all columns?

library(dplyr)  # provides tibble() and %>%

df <- tibble(col1 = sample(1:6), col2 = sample(1:6), col3 = 3, col4 = 4)

# Drop `mycolumn` from `mydataframe` if it holds a single unique value
condDelCol <- function(mycolumn, mydataframe) {
    if (length(unique(mydataframe[[mycolumn]])) == 1) {
        mydataframe[[mycolumn]] <- NULL
    }
    mydataframe
}

df %>% 
condDelCol("col2", .) %>% 
condDelCol("col3", .) %>% 
condDelCol("col4", .)

Solution

  • With dplyr, an option is select_if

    library(dplyr) 
    df %>%
       select_if(~ n_distinct(.) > 1)
    # A tibble: 6 x 2
    #   col1  col2
    #  <int> <int>
    #1     1     6
    #2     6     1
    #3     5     5
    #4     3     4
    #5     4     2
    #6     2     3
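
The scoped verbs such as select_if are superseded as of dplyr 1.0.0. A minimal sketch of the same idea using select() with where(), assuming dplyr 1.0.0 or later is installed; the name `result` is only illustrative:

```r
library(dplyr)

# Example data in the shape of the question: two varying and two constant columns.
df <- tibble(col1 = sample(1:6), col2 = sample(1:6), col3 = 3, col4 = 4)

# Keep only the columns that have more than one distinct value.
result <- df %>%
  select(where(~ n_distinct(.x) > 1))

names(result)
```

Because sample(1:6) is a permutation, col1 and col2 always have six distinct values, so only col3 and col4 are dropped; the row values themselves change from run to run.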
    

    Another way, in base R, is to loop over the columns with sapply to count the unique values in each one, take the names of the columns that have only a single unique value, and assign (<-) NULL to them

    i1 <-  sapply(df, function(x) length(unique(x)))
    df[names(which(i1 == 1))] <- NULL
    

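    Or with Filter, which keeps the columns for which var returns a nonzero value (a constant column has zero variance, which is treated as FALSE), so this only suits numeric columns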

    Filter(var, df)
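
If the goal is to keep the original condDelCol helper and still apply it to every column inside a single pipe step, one possible approach is to fold over the column names with Reduce. This is a sketch, not part of the original answer: it uses a plain data.frame to avoid extra dependencies, and the name `cleaned` is only illustrative.

```r
library(magrittr)

# Helper from the question: drop a column if it holds a single unique value.
condDelCol <- function(mycolumn, mydataframe) {
  if (length(unique(mydataframe[[mycolumn]])) == 1) {
    mydataframe[[mycolumn]] <- NULL
  }
  mydataframe
}

df <- data.frame(col1 = sample(1:6), col2 = sample(1:6), col3 = 3, col4 = 4)

# Reduce starts from the data frame (the third argument) and calls
# condDelCol once per column name, passing the result along each time.
cleaned <- df %>%
  Reduce(function(d, col) condDelCol(col, d), names(.), .)

names(cleaned)
```

Since the dot appears as a direct argument, magrittr does not also insert the data frame as the first argument of Reduce. To restrict the check to some columns, replace names(.) with a character vector such as c("col2", "col3", "col4").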