
Creating multiple new variables using Vectors, existing variables, and mapply in R


I am pretty new to R and am trying to create new columns in my data frame, df, from several columns that already exist in it. I was hoping to use the mapply function for this. The data are measurements taken on both the right and the left side of each person. Only one side is affected, and df$laterality records which one. Ultimately, I would like to create new columns that hold the measurements from the affected side.

My data, simplified, essentially looks like the following

recordID <- c(1, 2, 3, 4)
laterality <- c("right", "right", "left", "right")
right_1_measure <- c(2.3, 3.4, 1.7, 2.4)
right_2_measure <- c(1.3, 2.2, 3.1, 4.1)
right_3_measure <- c(2.7, 2.8, 4.2, 3.9)
left_1_measure <- c(1.5, 2.6, 4.5, 2.8)
left_2_measure <- c(1.1, 3.4, 3.5, 2.6)
left_3_measure <- c(2.6, 2.8, 3.6, 1.6)

df <- data.frame(recordID, laterality, right_1_measure, right_2_measure, right_3_measure, left_1_measure, left_2_measure, left_3_measure)

I then created vectors of the column names I wished to cycle through to make the new "affected" columns. The new columns would be named after the existing ones, with the side replaced by the prefix "aff", so I also created a vector of the names I hoped to give them.

right_vars <- c("right_1_measure", "right_2_measure" , "right_3_measure")
left_vars <- c("left_1_measure", "left_2_measure" , "left_3_measure")
aff_vars <- c("aff_1_measure", "aff_2_measure", "aff_3_measure")

I then created the function which I was planning to use to conditionally create the new columns based on df$laterality

aff_var_create <- function (x, y, z){
  df$x <- ifelse(df$laterality == "right" , df$y, ifelse (df$laterality == "left", df$z, NA))
}

Then I created my mapply code

mapply(FUN = aff_var_create, x = aff_vars, y = right_vars, z = left_vars)

However, when I run this I receive the following error message:

Error in ans[ypos] <- rep(yes, length.out = len)[ypos] : 
  replacement has length zero
In addition: Warning message:
In rep(yes, length.out = len) :
 Error in ans[ypos] <- rep(yes, length.out = len)[ypos] : 
  replacement has length zero 

I've checked my data frame and all columns have data in them, so I am confused as to why the replacement at ypos has length zero.

Ultimately, I would like my data frame to look like the following

recordID <- c(1, 2, 3, 4)
laterality <- c("right", "right", "left", "right")
right_1_measure <- c(2.3, 3.4, 1.7, 2.4)
right_2_measure <- c(1.3, 2.2, 3.1, 4.1)
right_3_measure <- c(2.7, 2.8, 4.2, 3.9)
left_1_measure <- c(1.5, 2.6, 4.5, 2.8)
left_2_measure <- c(1.1, 3.4, 3.5, 2.6)
left_3_measure <- c(2.6, 2.8, 3.6, 1.6)
aff_1_measure <- c(2.3, 3.4, 4.5, 2.4)
aff_2_measure <- c(1.3, 2.2, 3.5, 4.1)
aff_3_measure <- c(2.7, 2.8, 3.6, 3.9)

df <- data.frame(recordID, laterality, right_1_measure, right_2_measure, right_3_measure, left_1_measure, left_2_measure, left_3_measure, aff_1_measure, aff_2_measure, aff_3_measure)

Any suggestions for fixing this issue, or for another method that achieves a similar result, would be much appreciated! Thank you.


Solution

  • You cannot pass a column name stored in a variable with $ notation: df$y looks for a column literally named y, finds none, and returns NULL. Use [[ instead. Also, since mapply does not update the data frame in place, you need to assign its result to the new columns:

    right_vars <- c("right_1_measure", "right_2_measure" , "right_3_measure")
    left_vars <- c("left_1_measure", "left_2_measure" , "left_3_measure")
    aff_vars <- c("aff_1_measure", "aff_2_measure", "aff_3_measure")
    
    aff_var_create <- function(x, y, z){
      ifelse(df$laterality == "right" , df[[y]], ifelse(df$laterality == "left", df[[z]], NA))
    }
    
    df[aff_vars] <- mapply(FUN=aff_var_create, x=aff_vars, y=right_vars, z=left_vars)
    
    df
    

    Alternatively, assign by row and column indexing with [. Note that this version treats every row that is not "right" as left.

    aff_cols <- paste0("aff_", 1:3, "_measure")
    right_cols <- paste0("right_", 1:3, "_measure")
    left_cols <- paste0("left_", 1:3, "_measure")
    curr_logic <- df$laterality == "right"
    
    # INITIALIZE COLUMNS
    df[aff_cols] <- NA
    
    # UPDATE COLUMNS BY INDEX
    df[curr_logic , aff_cols] <- df[curr_logic , right_cols]
    df[!curr_logic , aff_cols] <- df[!curr_logic, left_cols]
    
    df
    

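    Even better, use a single ifelse call. ifelse works elementwise, so when the test and both value arguments are matrices of the same dimensions the columns line up (hence replicate, which expands the logical vector into a matching matrix).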

    aff_cols <- paste0("aff_", 1:3, "_measure")
    right_cols <- paste0("right_", 1:3, "_measure")
    left_cols <- paste0("left_", 1:3, "_measure")
    curr_logic <- df$laterality == "right"
    
    df[aff_cols] <- ifelse(replicate(3, curr_logic), 
                           as.matrix(df[right_cols]), 
                           as.matrix(df[left_cols]))
    
    df
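
To see why the original function failed and why the matrix version lines up, here is a minimal, self-contained sketch. It rebuilds the sample data from the question (with laterality quoted) and uses a few illustrative names that are not in the original post: via_dollar, via_brackets, err, is_right and test_mat.

```r
# Sample data from the question, with laterality as a character vector.
df <- data.frame(
  recordID        = c(1, 2, 3, 4),
  laterality      = c("right", "right", "left", "right"),
  right_1_measure = c(2.3, 3.4, 1.7, 2.4),
  right_2_measure = c(1.3, 2.2, 3.1, 4.1),
  right_3_measure = c(2.7, 2.8, 4.2, 3.9),
  left_1_measure  = c(1.5, 2.6, 4.5, 2.8),
  left_2_measure  = c(1.1, 3.4, 3.5, 2.6),
  left_3_measure  = c(2.6, 2.8, 3.6, 1.6)
)

y <- "right_1_measure"

# `$` takes the name literally: it looks for a column called "y" and returns NULL.
via_dollar <- df$y
# `[[` evaluates its argument, so the string stored in y selects the column.
via_brackets <- df[[y]]

# Handing that NULL to ifelse() as the `yes` value is what raises the error
# reported in the question; the message is captured here instead of stopping.
err <- suppressWarnings(tryCatch(
  ifelse(df$laterality == "right", df$y, NA),
  error = function(e) conditionMessage(e)
))

right_vars <- c("right_1_measure", "right_2_measure", "right_3_measure")
left_vars  <- c("left_1_measure", "left_2_measure", "left_3_measure")
aff_vars   <- c("aff_1_measure", "aff_2_measure", "aff_3_measure")

# replicate() copies the logical vector once per target column, giving a
# matrix with one row per record and one column per new variable.
is_right <- df$laterality == "right"
test_mat <- replicate(length(aff_vars), is_right)

# All three arguments now share the same shape, so ifelse() picks
# element by element and the result can be assigned to the new columns.
df[aff_vars] <- ifelse(test_mat,
                       as.matrix(df[right_vars]),
                       as.matrix(df[left_vars]))

df
```

Using length(aff_vars) instead of a hard-coded 3 is only a convenience, so the same lines keep working if more measurement columns are added to the three name vectors.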