
Extracting vector name from df$vector in custom function with str_split() in R


I have been trying to write a custom function that returns the variable name as a string when the input x is a specific vector from a data frame, i.e. in the form df$vector, so that it works like this:

vector.name(iris$Species)

>"Species"

Currently I am doing this:

vector.name<-function(x){
  require(stringr)
  
  #convert df$variable into string
  xname <- as.character(deparse(substitute(x)))
  
  if (str_detect(xname,"$")==T) {
    str_split(xname,"$")
  }
}  

but the result is not what I want:

> vector.name(iris$Species)
[[1]]
[1] "iris$Species" ""           

I have tried both strsplit() (base) and str_split() (stringr); both work as expected when splitting on ordinary alphabetic characters, e.g.

> str_split(as.character(deparse(substitute(iris$Species))),"S")
[[1]]
[1] "iris$"  "pecies"

How do I extract "vector" from df$vector in a custom function then?


Solution

  • In a regular expression, $ is a metacharacter that matches the end of the string. Either escape it (\\$), wrap it in square brackets ([$]), or use fixed() to match the character literally:

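    vector.name <- function(x) {
      # Capture the unevaluated argument as a string, e.g. "iris$Species"
      xname <- as.character(deparse(substitute(x)))
      # fixed() matches "$" literally; namespaced because stringr is not attached here
      if (stringr::str_detect(xname, stringr::fixed("$"))) {
        stringr::str_split(xname, stringr::fixed("$"))
      }
    }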
    

    Testing:

    vector.name(iris$Species)
    [[1]]
    [1] "iris"    "Species"
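
    The question asks for "Species" alone, whereas the function above returns both pieces in a list. Here is a minimal sketch of one way to return only the column name, using base R's strsplit() with fixed = TRUE in place of stringr. The name vector_name is hypothetical, and taking the last piece assumes the column name itself contains no $:

    ```r
    # Hypothetical helper (not part of the original answer): return only the column name.
    vector_name <- function(x) {
      # Capture the unevaluated argument as one string, e.g. "iris$Species"
      xname <- paste(deparse(substitute(x)), collapse = "")
      # Split on a literal "$"; fixed = TRUE turns off regex interpretation
      parts <- strsplit(xname, "$", fixed = TRUE)[[1]]
      # Keep the last piece; with no "$" present this is the whole expression
      parts[length(parts)]
    }

    vector_name(iris$Species)
    # [1] "Species"
    ```

    Because the last piece is returned, a plain variable such as vector_name(y) gives "y" back instead of NULL.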
    

    Note that the str_detect() call in the question returns TRUE only by coincidence: an unescaped $ matches the end of the string, and every string has an end, including an empty one.

    > str_detect("iris$Species", "$")
    [1] TRUE
    > str_detect("", "$")
    [1] TRUE
    

    With the $ escaped, the pattern matches only a literal dollar sign:

    > str_detect("iris$Species", "\\$")
    [1] TRUE
    > str_detect("", "\\$")
    [1] FALSE
    

    The same applies to str_split(): the unescaped $ matches the empty position at the end of the string, so the split returns the whole string followed by an empty second element:

    > str_split("iris$Species", "$")
    [[1]]
    [1] "iris$Species" ""
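
    For comparison, the three options listed at the top of this answer (escaping, a character class, and literal matching) can also be tried in base R, without stringr. This is only a sketch; the variable names are illustrative:

    ```r
    s <- "iris$Species"

    # 1. Escape the metacharacter so it is treated as a literal dollar sign
    escaped <- strsplit(s, "\\$")[[1]]

    # 2. Put it inside a character class, where "$" has no special meaning
    bracketed <- strsplit(s, "[$]")[[1]]

    # 3. Skip regex interpretation entirely with fixed = TRUE
    literal <- strsplit(s, "$", fixed = TRUE)[[1]]

    # All three give the same two pieces
    print(escaped)
    # [1] "iris"    "Species"

    # Literal detection no longer matches the empty string
    has_dollar   <- grepl("$", s, fixed = TRUE)   # TRUE
    empty_dollar <- grepl("$", "", fixed = TRUE)  # FALSE
    ```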