
Replacing column names, with matching column names, that have a label attached


I have data as follows:

library(data.table)
library(haven)
library(Hmisc)
library(foreign)

df <- fread("A B D C E
               1 2 2 3 1")

For some of the variables I want to add the column labels stored in Lbl (mainly for use in Stata).

Lbl <- structure(list(Varcode = structure(1:3, class = "factor", levels = c("A", 
"B", "C")), Variables = c("Variable label for variable A", "Variable label for variable B", 
"Variable label for variable C")), row.names = c(NA, -3L), class = "data.frame")

Since not all variables have labels, I want to attach a label only to those columns in df whose names match an entry in Lbl$Varcode, and leave the remaining columns unchanged.

The following code only works if all variables have labels and they are in order:

# set labels
for (i in seq_len(nrow(Lbl))) {
  Hmisc::label(df[[Lbl$Varcode[i]]]) <- Lbl$Variables[i]
}
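
A likely reason for this behaviour, sketched in base R: Lbl$Varcode is a factor, and indexing with a factor uses its integer codes rather than the printed level names, so the loop picks columns by position. The list `cols` and its values are made up for illustration so that the elements can be told apart; only the column order A B D C E is taken from the question.

```r
# Hypothetical stand-in for df: same column order, distinct made-up values
cols <- list(A = 10L, B = 20L, D = 30L, C = 40L, E = 50L)

# Same level set as Lbl$Varcode; "C" has integer code 3
code <- factor("C", levels = c("A", "B", "C"))

# Indexing with the factor uses the integer code, i.e. the third element (D)
by_code <- cols[[code]]

# Converting to character first selects the element that is actually named C
by_name <- cols[[as.character(code)]]

print(c(by_code = by_code, by_name = by_name))
```

Under this reading, the label meant for C ends up on D, because D is the third column. Wrapping the index in as.character() in the original loop should make it select columns by name instead.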


How should I add the labels to the correct columns?


Solution

  • Loop over the column names of df and look up the corresponding label in Lbl. If no label exists for a column, skip to the next one.

    library(data.table)
    library(haven)
    library(Hmisc)
    library(foreign)
    
    df <- fread("A B D C E
                 1 2 2 3 1")
    
    Lbl <- structure(list(Varcode = structure(1:3, class = "factor", levels = c("A", "B", "C")),
                          Variables = c("Variable label for variable A",
                                        "Variable label for variable B",
                                        "Variable label for variable C")),
                     row.names = c(NA, -3L), class = "data.frame")
    
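    # loop over the column names of df rather than the rows of Lbl
    for (i in names(df)) {
      # look up the label for this column; the result is empty if there is none
      label <- Lbl[Lbl$Varcode == i, "Variables"]
      if (length(label) == 0) next
    
      Hmisc::label(df[[i]]) <- label
    }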
    
    df
    #>             A          B     D          C     E
    #>    <labelled> <labelled> <int> <labelled> <int>
    #> 1:          1          2     2          3     1
    

    Created on 2022-11-11 with reprex v2.0.2
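
The same idea can also be sketched without an explicit comparison per column, by building a named lookup vector first. This is an alternative illustration and not part of the answer above: it uses base R only, a plain data.frame stands in for the data.table, and the label is stored directly in the "label" attribute instead of going through Hmisc::label(). The names `lookup` and `labels_found` are introduced here for the example.

```r
# Plain data.frame standing in for the data.table from the question
df <- data.frame(A = 1L, B = 2L, D = 2L, C = 3L, E = 1L)

Lbl <- data.frame(
  Varcode = factor(c("A", "B", "C")),
  Variables = c("Variable label for variable A",
                "Variable label for variable B",
                "Variable label for variable C"),
  stringsAsFactors = FALSE
)

# Named character vector: names are column names, values are the labels.
# as.character() ensures the factor is used by level name, not integer code.
lookup <- setNames(Lbl$Variables, as.character(Lbl$Varcode))

# Only columns that appear in both df and the lookup receive a label
for (nm in intersect(names(df), names(lookup))) {
  attr(df[[nm]], "label") <- lookup[[nm]]
}

# Collect the label of every column, NA where none was set
labels_found <- vapply(df, function(x) {
  lab <- attr(x, "label")
  if (is.null(lab)) NA_character_ else lab
}, character(1))

print(labels_found)
```

Because the loop runs over the intersection of names, neither the column order nor unlabelled columns such as D and E affect which label goes where.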
