I'm pretty new to R. I want to use a function that performs several calculations across columns of a data frame and creates new columns holding the results. I want to apply this function to a list of data frames, but when I use lapply I get an error saying that the first column-name argument is missing, with no default.
I know this must be an issue with how I have written my function, but I am struggling to find a solution. How can I proceed?
#create example data frames; my real data frames are named similarly, with an identical name plus a unique id (i.e. example_df_uniqueidnumber), and each data frame has identically named columns
df1 <- data.frame(pt1_X = c(1,2,3), pt2_X = c(1,2,3), pt1_Y = c(1,2,3), pt2_Y =c(1,2,3))
df2 <- data.frame(pt1_X = c(1,2,3), pt2_X = c(1,2,3), pt1_Y = c(1,2,3), pt2_Y =c(1,2,3))
#create my example function
#NOTE: I call the data "data" (instead of df1 or df2), because I am unsure of what to use instead, as each file name is different due to the unique identifier
calculate_angles1 <- function(data, pt1_X, pt1_Y, pt2_X, pt2_Y) {
  data$Mx <- (data[[pt1_X]] - data[[pt2_X]])
  data$My <- (data[[pt1_Y]] - data[[pt2_Y]])
  return(data)
}
#create my list of data frames
new_list <- list(df1, df2)
#use lapply to attempt to apply my function to my list of data frames
AoA <- lapply(new_list, calculate_angles1)
After I run lapply, I receive this error message:
Error in (function(x, i, exact) if (is.matrix(i)) as.matrix(x)[[i]] else .subset2(x, :
argument "pt1_X" is missing, with no default
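The error occurs because lapply passes only each data frame to your function, so the arguments pt1_X, pt1_Y, pt2_X and pt2_Y are never supplied. Since the columns have the same names in every data frame, you can drop those arguments and refer to each column by its name in double quotes. Single brackets, as in data["pt1_X"], work for this; data[["pt1_X"]] with the quoted name works too.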
So the function could be rewritten as:
calculate_angles1 <- function(data) {
  # differences between the two points' X and Y coordinates
  data["Mx"] <- data["pt1_X"] - data["pt2_X"]
  data["My"] <- data["pt1_Y"] - data["pt2_Y"]
  data
}
There are several ways to apply your function to a list of data frames. One is lapply, as you mentioned:
new_list <- lapply(new_list, calculate_angles1)
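If you would rather keep the column names as arguments (for instance, in case they differ in some other data set), a possible sketch is below. It relies on the fact that lapply forwards any extra arguments to the function it calls. The name calculate_angles_named and the default values are hypothetical, chosen only for illustration, and the example data simply mirror the question.

```r
# Example data with the same shape as in the question
df1 <- data.frame(pt1_X = c(1, 2, 3), pt2_X = c(1, 2, 3),
                  pt1_Y = c(1, 2, 3), pt2_Y = c(1, 2, 3))
df2 <- df1
new_list <- list(df1, df2)

# Hypothetical variant: column names stay as arguments, but with defaults,
# so calling the function with only a data frame no longer fails
calculate_angles_named <- function(data, pt1_X = "pt1_X", pt1_Y = "pt1_Y",
                                   pt2_X = "pt2_X", pt2_Y = "pt2_Y") {
  data$Mx <- data[[pt1_X]] - data[[pt2_X]]
  data$My <- data[[pt1_Y]] - data[[pt2_Y]]
  data
}

# Option 1: rely on the defaults
AoA <- lapply(new_list, calculate_angles_named)

# Option 2: pass the column names explicitly; lapply hands them on to the function
AoA2 <- lapply(new_list, calculate_angles_named,
               pt1_X = "pt1_X", pt1_Y = "pt1_Y",
               pt2_X = "pt2_X", pt2_Y = "pt2_Y")
```

With the original four-argument function and no defaults, only the second form would work, because the column names have to come from somewhere.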
Another is the map() function from the purrr package (part of the tidyverse), which I think is more straightforward. Your function is a data frame function: it takes a data frame as its first argument and returns a data frame. So you can call dplyr verbs inside map() to manipulate your data, as I did here by calling mutate() to create the new variables:
library(tidyverse)
new_list <- map(new_list, ~ mutate(.x, Mx=pt1_X-pt2_X, My=pt1_Y-pt2_Y))
Both of these options produce the same output:
> new_list
[[1]]
  pt1_X pt2_X pt1_Y pt2_Y Mx My
1     1     1     1     1  0  0
2     2     2     2     2  0  0
3     3     3     3     3  0  0

[[2]]
  pt1_X pt2_X pt1_Y pt2_Y Mx My
1     1     1     1     1  0  0
2     2     2     2     2  0  0
3     3     3     3     3  0  0
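If you prefer not to load the tidyverse, a base R counterpart of the mutate() call can be sketched with transform(), which also lets you refer to columns by bare name. This is only an illustration on the example data from the question; the anonymous function is just one way to write it.

```r
# Example data with the same shape as in the question
df1 <- data.frame(pt1_X = c(1, 2, 3), pt2_X = c(1, 2, 3),
                  pt1_Y = c(1, 2, 3), pt2_Y = c(1, 2, 3))
df2 <- df1
new_list <- list(df1, df2)

# transform() evaluates the expressions inside each data frame
# and appends the results as new columns
new_list <- lapply(new_list, function(df) {
  transform(df, Mx = pt1_X - pt2_X, My = pt1_Y - pt2_Y)
})
```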