I have a list containing 981 data frames. Each data.frame has the same structure.
I want to lag one column (called growth) to calculate the growth over time (from one observation to another) for each data frame.
I tried lapply but somehow could not get it done.
my_list <-
list(
data.frame(time = 1:10, growth = rnorm(10, mean = 1.3, sd = 2)),
data.frame(time = 1:10, growth = rnorm(10, mean = 1.3, sd = 2)),
data.frame(time = 1:10, growth = rnorm(10, mean = 1.3, sd = 2))
)
If you are not able to share real data, you can create a fake dataset to make the post reproducible.
If I have understood you correctly, here is what you can do with lapply:
lapply(list_df, function(x) {x$difference <- c(NA, diff(x$growth)); x})
#[[1]]
# growth b difference
#1 3 a NA
#2 8 b 5
#3 4 c -4
#4 7 d 3
#5 6 e -1
#6 1 f -5
#7 10 g 9
#8 9 h -1
#9 2 i -7
#10 5 j 3
#[[2]]
# growth b difference
#1 10 a NA
#2 5 b -5
#3 6 c 1
#4 9 d 3
#5 1 e -8
#6 7 f 6
#7 8 g 1
#8 4 h -4
#9 3 i -1
#10 2 j -1
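Applied to the structure from the question (columns time and growth), the same idea can be sketched as below. The helper name add_difference is hypothetical, set.seed is only there to make the fake data repeatable, and the sketch assumes the rows of each data frame are already ordered by time.

```r
# Fake data with the same structure as in the question
set.seed(42)
my_list <- list(
  data.frame(time = 1:10, growth = rnorm(10, mean = 1.3, sd = 2)),
  data.frame(time = 1:10, growth = rnorm(10, mean = 1.3, sd = 2)),
  data.frame(time = 1:10, growth = rnorm(10, mean = 1.3, sd = 2))
)

# Hypothetical helper: change in growth from the previous observation.
# The first row has no predecessor, so it gets NA.
add_difference <- function(df) {
  df$difference <- c(NA, diff(df$growth))
  df
}

# Apply the helper to every data frame; the result is again a list
result <- lapply(my_list, add_difference)
head(result[[1]], 3)
```

Because lapply returns a new list, my_list itself is left unchanged; assign the result back to my_list if you want to overwrite it.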
The tidyverse way to do the same would be:
library(dplyr)
library(purrr)
map(list_df, . %>% mutate(difference = c(NA, diff(growth))))
OR
map(list_df, . %>% mutate(difference = growth - lag(growth)))
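As a quick sanity check that the two forms compute the same thing, here is a small sketch on a single vector. The values are made up for illustration, and it assumes dplyr is attached so that lag refers to dplyr::lag rather than stats::lag.

```r
library(dplyr)

# Made-up values, for illustration only
growth <- c(3, 8, 4, 7)

# Change from the previous observation, computed two ways
via_diff <- c(NA, diff(growth))    # diff() drops the first element, so pad with NA
via_lag  <- growth - lag(growth)   # lag() shifts the vector by one and pads with NA

# Should be TRUE if both forms agree
isTRUE(all.equal(via_diff, via_lag))
```

If dplyr is not attached, a bare lag call resolves to stats::lag, which is meant for time series objects and does not shift a plain vector, so calling dplyr::lag explicitly is the safer choice in scripts.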
data
set.seed(123)
list_df <- list(data.frame(growth = sample(10), b = letters[1:10]),
data.frame(growth = sample(10), b = letters[1:10]))