
Removing S3 labels in a list


I am working with a large downloaded dataset, with the end goal of joining many data frames.

For the past week or so, I have been unable to join data frames due to incompatibility of the data types "labelled" vs. "character." Ultimately I would like to map my function to the same variable in multiple data frames in a list.

Structure of each df is as follows (edited to change variables/attr names because I cannot share the data). The variable of interest I'm working with here is "CODE":

structure(list(VAR1 = structure(c(val, val, val, val, val, val), .Label = c("a", 
"b", "c", "d"), class = "factor"), ID = c(1, 
2, 3, 4, 5, 6), CODE = structure(c("c1", "c1", 
"c1", "c1", "c1", "c1"), label = "instance code", units = "-4", class = c("labelled", 
"character")), ...

I'm still relatively new to R/RStudio, so I thought for a while my issue was with mapping throughout the list, but when I pick one element to remove labels, it still doesn't work. It is almost as if R doesn't know that the label is there, despite the fact that when I use get_label, the label shows up (function below).

get_label(my.list[["my.df"]][["my.variable"]])

I have tried the following methods (I am showing it as if I am working with a single variable instead of the whole list, which is how I have been experimenting for the last couple of days):

  1. The class function. Interestingly, when I check the class afterwards, it says the class is character; however, when I look at the data frame, the class still says "character [# of elements] (S3: labelled, character)"
class(my.list[["my.df"]][["my.variable"]]) <- "character"
  2. remove_label function
remove_label(my.list[["my.df"]][["my.variable"]])
  3. unclass function. This one worked for one variable at a time, but did not map over the whole list, so I am including my mapping code in case that is the issue here.
## for one variable
unclass(my.list[["my.df"]][["my.variable"]])

## for entire list
my.list %>%
map_at("my.variable", ~ unclass)

## I also tried map in case it was a map_at issue--still didn't work.
  4. zap_label
zap_label(my.list[["my.df"]][["my.variable"]])
  5. setting the attribute to null
attr(my.list[["my.df"]][["my.variable"]], "label") <- NULL
  6. as.character
as.character(my.list[["my.df"]][["my.variable"]])

Does anyone have any ideas? Could it be a bug in R, or is it just my relative inexperience with R showing?

I have also tried modifying these functions in case I was misinterpreting the label and it was value labels instead of variable labels causing the issue. It's not!

Thanks for any assistance!


Solution

  • You could use the labelled package, which lets you set and remove variable labels:

    library(labelled)
    
    
    my.df = data.frame(test = "a test")
    labelled::var_label(my.df) <- list(test='a test label')
    
    var_label(my.df$test)
    #> [1] "a test label"
    
    my.list <-list(my.df = my.df)
    
    var_label(my.list[["my.df"]][["test"]])
    #> [1] "a test label"
    
    my.list[["my.df"]][["test"]] <- remove_labels(my.list[["my.df"]][["test"]])
    
    var_label(my.list[["my.df"]][["test"]])
    #> NULL
    
    my.list[["my.df"]][["test"]]
    #> [1] "a test"
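
To cover every data frame in the list rather than a single column, a base R approach along the following lines may work. This is only a sketch under assumptions: `make_df` builds hypothetical stand-in data modelled on the `dput` output in the question, and `strip_labels` is a hypothetical helper name, not a function from any package. It relies on `as.character()` dropping all attributes (the class vector, `label`, and `units`), and it assigns the result back, because functions such as `unclass()` and `as.character()` return a modified copy and leave `my.list` unchanged otherwise.

```r
# Hypothetical stand-in data: each data frame has a CODE column
# carrying the "labelled" class plus label/units attributes.
make_df <- function() {
  df <- data.frame(ID = c(1, 2, 3))
  df[["CODE"]] <- structure(
    c("c1", "c1", "c1"),
    label = "instance code",
    units = "-4",
    class = c("labelled", "character")
  )
  df
}

my.list <- list(df1 = make_df(), df2 = make_df())

# Hypothetical helper: replace one column with a plain character vector.
# as.character() strips all attributes, so the class and label both go.
strip_labels <- function(df, col = "CODE") {
  if (col %in% names(df)) {
    df[[col]] <- as.character(df[[col]])
  }
  df
}

# Apply to every data frame and store the result back in the list.
my.list <- lapply(my.list, strip_labels)

# Inspect the class of CODE in each data frame.
sapply(my.list, function(df) class(df[["CODE"]]))
```

As for the `map_at()` attempt in the question: `map_at("my.variable", ...)` selects elements of `my.list` by name, and those elements are data frames, not columns, so a column name never matches. In addition, `~ unclass` builds a function that returns `unclass` itself instead of calling it; `~ unclass(.x)` would be the calling form.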