Tags: r, loops, global-variables, apply, r-factor

Loop through a vector of variable names while dropping factor levels


I have a character vector containing names of structurally identical variables in .GlobalEnv. I want to drop levels from the same factor for each variable.

How do I, e.g. by expanding on the code below, actually update the variables in the global environment? With the example data, this would mean dropping the levels of factor z that no longer occur in W and Q after subsetting, and assigning the results back to .GlobalEnv.

lapply(mget(var_list, .GlobalEnv), function(x) levels(x$z) <- droplevels(x$z))

I thought I could do something similar to the example below, which changes e.g. rownames using lapply:

list_of_dfs <- lapply(list_of_dfs, "rownames<-", NULL)

Dummy data:

z <- c("a",'a','b','a','c','d','b','d','d','b')
x <- c(10,2,33,41,5,64,17,11,22,1)
y <- c('W','W','W','W','W','Q','Q','Q','Q','Q')
df <- data.frame(z = factor(z), x = x, y = y)  # z as a factor; cbind() would coerce every column to character

var_list <- c('W', 'Q')

# subsetting and assigning variables
for(i in var_list){
   assign(i, subset(df, y %in% i), envir=.GlobalEnv)
}

Solution

  • In this line

    lapply(mget(var_list, .GlobalEnv), function(x) levels(x$z) <- droplevels(x$z))

    you only modify x, a local copy inside the anonymous function, not the variables Q and W in .GlobalEnv.

    You can try something like:

    lapply(var_list, function(x) assign(x, droplevels(get(x, envir=.GlobalEnv)), .GlobalEnv))
    

    This drops the unused levels from every factor column of each data frame, not only z.

    To drop levels only from z, you can try:

    lapply(var_list, function(x) 
           eval(parse(text=paste0(x, "['z'] <<- droplevels(", x,"['z'])"))))
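
As a sketch of an alternative that avoids eval(parse()), the same get()/assign() pattern can be applied to z alone. This assumes z is stored as a factor: since R 4.0, data.frame() no longer converts strings to factors by default, so the example builds the factor explicitly. The names nm and d are only illustrative.

```r
# Dummy data as in the question, with z stored explicitly as a factor
df <- data.frame(
  z = factor(c("a", "a", "b", "a", "c", "d", "b", "d", "d", "b")),
  x = c(10, 2, 33, 41, 5, 64, 17, 11, 22, 1),
  y = c("W", "W", "W", "W", "W", "Q", "Q", "Q", "Q", "Q")
)
var_list <- c("W", "Q")

# Subset and assign, as in the question
for (i in var_list) {
  assign(i, subset(df, y %in% i), envir = .GlobalEnv)
}

# Drop unused levels from z only, then write each data frame back
for (nm in var_list) {
  d <- get(nm, envir = .GlobalEnv)   # fetch the data frame by name
  d$z <- droplevels(d$z)             # touch only column z
  assign(nm, d, envir = .GlobalEnv)  # overwrite the global variable
}

levels(W$z)  # "a" "b" "c"
levels(Q$z)  # "b" "d"
```

A plain for loop is used here instead of lapply because the call is made only for its side effect on .GlobalEnv; keeping the data frames in a named list and updating the list is usually easier to maintain than separate global variables.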