
R - Labels are lost when I subset a data frame imported with read_sav from the haven package


I use the read_sav function from the haven package to import an SPSS file, so the columns have names and associated labels (class labelled).

The labels are lost when I subset the data frame with subset. I can work around this by indexing with data[i], but is this behavior a bug?

Here is a simple example.

DataForExample <- structure(list(q0001_0001 = structure(c(2, NA, 5, 4, NA), label = "être plus rapide", class = "labelled", labels = structure(c(1, 
2, 3, 4, 5), .Names = c("non, pas du tout", "non, pas vraiment", 
"oui, un peu", "oui, tout à fait", "je ne sais pas"))), q0001_0002 = structure(c(NA, 
3, NA, 4, 2), label = "être plus fiable", class = "labelled", labels = structure(c(1, 
2, 3, 4, 5), .Names = c("non, pas du tout", "non, pas vraiment", 
"oui, un peu", "oui, tout à fait", "je ne sais pas")))), .Names = c("q0001_0001", 
"q0001_0002"), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, 
-5L))

View(DataForExample) # OK
Toto <- subset(DataForExample, select = q0001_0001)
View(Toto) # not OK: the labels have disappeared
Toto2 <- DataForExample[1]
View(Toto2) # OK

Thanks


Solution

  • The answer is the same as for your previous question about sorting. You need to load a package that supports subsetting operations for the labelled class, preferably after haven. At least two packages provide this support: Hmisc and expss. No further action is needed, just library(expss) or library(Hmisc).
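To illustrate why loading such a package helps, here is a minimal base R sketch. It assumes the cause is that subset() indexes rows, which calls `[` on every column, and that without a `[` method for the labelled class the default `[` drops all attributes except names. The `[.labelled` method below is a hypothetical stand-in written for this illustration, not the actual code from expss or Hmisc, and the example uses a plain data.frame rather than a tibble.

```r
# A labelled column built by hand, mirroring the structure in the question.
x <- structure(c(2, NA, 5, 4, NA),
               label  = "être plus rapide",
               labels = c("non, pas du tout" = 1, "je ne sais pas" = 5),
               class  = "labelled")
df <- structure(list(q = x), class = "data.frame", row.names = c(NA, -5L))

# subset() ends up calling x[rows] on each column; with no `[` method for
# "labelled", the default `[` strips the attributes.
lost <- subset(df, select = q)
attr(lost$q, "label")   # NULL

# List-style indexing leaves the column object untouched.
kept <- df[1]
attr(kept$q, "label")   # "être plus rapide"

# Hypothetical method (an assumption for illustration only): subset with the
# default method, then restore the label attributes and the class.
`[.labelled` <- function(x, ...) {
  out <- NextMethod("[")
  attr(out, "label")  <- attr(x, "label")
  attr(out, "labels") <- attr(x, "labels")
  class(out) <- class(x)
  out
}

fixed <- subset(df, select = q)
attr(fixed$q, "label")  # "être plus rapide"
```

With library(expss) or library(Hmisc) loaded, a hand-written method like this should not be needed, since those packages supply their own.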