
R: Creating a new variable by partially matching other variable names


I would like to create a new variable x that holds the sum of all variables whose names contain an a or A, using the following sample:

structure(list(A = 10L, a = 20L, abc = 1L), .Names = c("A", "a", "abc"), 
class = "data.frame", row.names = c(NA, -1L))

The result should look like this:

structure(list(A = 10L, a = 20L, abc = 1L, x = 31L), .Names = c("A", "a", 
"abc", "x"), class = "data.frame", row.names = c(NA, -1L))

I attempted to accomplish this via:

names1$x[grep("a" | "A", colnames(names1))]

The following error resulted:

Error in "a" | "A" : operations are possible only for numeric, logical or complex types

I also tried matching only a, but this returned NULL.


Solution

  • We can use grepl with the pattern "a|A" to find the columns whose names contain an a or A, and which to turn those matches into column indices.

    I've added a column b to your sample data to show that it is left out of the sum. Apart from that, the data is copied from your code.

    input <- structure(list(A = 10L, a = 20L, abc = 1L, b = 3L), .Names = c("A", "a", "abc", "b"), class = "data.frame", row.names = c(NA, -1L))
    
    > input
       A  a abc b
    1 10 20   1 3
    

    Then we can do:

    input$x <- sum(input[which(grepl("a|A",names(input)))])
    
    > input
       A  a abc b  x
    1 10 20   1 3 31
    

    Note how b is not summed over.
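
    The sum() call above collapses every matching cell into a single number, which is fine for this one-row sample. If the data had several rows and x should be computed per row, one possible variant uses rowSums instead. This is only a sketch: the two-row data below is made up for illustration (the second row is not from the question), and ignore.case = TRUE is swapped in for the "a|A" pattern.

    ```r
    # Hypothetical two-row version of the sample data (second row is invented)
    input <- data.frame(A = c(10L, 1L), a = c(20L, 2L),
                        abc = c(1L, 3L), b = c(3L, 4L))

    # TRUE for every column whose name contains "a" or "A"
    has_a <- grepl("a", names(input), ignore.case = TRUE)

    # Sum the matching columns row by row; b is excluded
    input$x <- rowSums(input[has_a])

    input$x  # 31 for the first row, 6 for the second
    ```

    A logical vector from grepl can index the columns directly, so which is not needed here.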

    Hope this is what you were after!
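
    As for why the original attempt failed, here is a short sketch of what I believe is going on, using the names1 data from the question. The variable names attempt and idx are just illustrative.

    ```r
    names1 <- structure(list(A = 10L, a = 20L, abc = 1L),
                        .Names = c("A", "a", "abc"),
                        class = "data.frame", row.names = c(NA, -1L))

    # `|` is R's logical OR, so "a" | "A" is evaluated before grep() runs
    # and fails on character operands; try() captures the error here.
    attempt <- try(grep("a" | "A", colnames(names1)), silent = TRUE)
    inherits(attempt, "try-error")  # TRUE

    # The alternation belongs inside the pattern string instead.
    idx <- grep("a|A", colnames(names1))
    idx                             # 1 2 3

    # names1 has no column x yet, so names1$x is NULL,
    # and indexing NULL returns NULL again.
    is.null(names1$x[idx])          # TRUE
    ```

    That would also explain the NULL you saw when matching only a: the grep call itself works, but it is used to index a column that does not exist yet.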