r, replace, purrr, stringr

How can I extract some information from variable labels using map and str_remove_all in R?


I have a data frame of labelled variables imported using the haven package in R. For a subset of variables I want to make use of a part of the variable label. I have a good regex that will work, but I don't understand why the combination of map and str_remove_all is not working here.

#Load packages (purrr and stringr both provide %>%)
library(labelled)
library(purrr)
library(stringr)
#random variables
var1<-sample(seq(1,10,1), size=10, replace=TRUE)
var2<-sample(seq(1,10,1), size=10, replace=TRUE)
#Assign variable labels
var_label(var1)<-"A long variable label - Some Info"
var_label(var2)<-"Another long variable label - Some Other Info"
#Make dataframe
df<-data.frame(var1, var2)
#Confirm variable labels
var_label(df)
#Try to remove relevant string from each
df %>% 
  var_label() %>% 
#Remove everything but what is desired
  map(., str_remove_all(., ".+ - "))

The output is just NULL.

What is wrong with using map here? The input is a list, and then I provide a function. So what is going on?


Solution

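  • The second argument of map() should be a function or a formula. In your call, str_remove_all(., ".+ - ") is evaluated before map() runs, so map() receives its result (a character vector) instead of a function. purrr treats a character vector as names to extract from each element, which is why you get NULL. Either of these two works: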

    df %>% 
      var_label() %>% 
      map(., \(x) str_remove_all(x, ".+ - "))
    
    df %>% 
      var_label() %>% 
      map(., ~str_remove_all(., ".+ - "))
    

    The documentation of map() prefers the first version:

    A formula, e.g. ~ .x + 1. You must use .x to refer to the first argument. Only recommended if you require backward compatibility with older versions of R.
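
To see both behaviours side by side, here is a minimal sketch. It assumes a plain named list, `labels`, standing in for the output of var_label(df); that stand-in and the names `extracted` and `short` are illustrative and not part of the question. It also uses map_chr() instead of map(), only to get a character vector back rather than a list.

```r
library(purrr)
library(stringr)

# Hypothetical stand-in for var_label(df): a named list of label strings
labels <- list(
  var1 = "A long variable label - Some Info",
  var2 = "Another long variable label - Some Other Info"
)

# A character vector as the second argument is treated as a name to extract.
# The label strings have no element with that name, so each result is NULL.
extracted <- map(labels, "Some Info")

# An anonymous function is applied to each label in turn
# (the \(x) shorthand needs R 4.1 or later)
short <- map_chr(labels, \(x) str_remove_all(x, ".+ - "))
short
```

If a list is preferred, swapping map_chr() back to map() keeps the same anonymous function and returns one string per variable.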