I have two lists with four data frames each. The data frames in both lists ("loc_list_future" and "loc_list_2019") have 33 columns: "Year" and then mean precipitation values for 32 different climate models.
The data frames in loc_list_future look like this, but with 32 model columns in total and data running through the year 2059:
Year Model 1 Model 2 Model 3 ...Model 32
2020 714.1101 686.5888 1048.4274
2021 1018.0095 766.9161 514.2700
2022 756.7066 902.2542 906.2877
2023 906.9675 919.5234 647.6630
2024 767.4008 861.1275 700.2612
2025 876.1538 738.8370 664.3342
2026 781.5092 801.2387 743.8965
2027 876.3522 819.4323 675.3022
2028 626.9468 927.0774 696.1884
2029 752.4084 824.7682 835.1566
...
2059
The data frames in loc_list_2019 have years ranging from 2006-2019 but otherwise look the same.
Each data frame represents a geographic location, and the two lists have the same four locations but one list is for 2006-2019 values and the other is for future values.
I would like to run two-sample t-tests that compare the 2006-19 values with the future values for each model at each location.
I have another list (loc_list_OBS) whose data frames have only two columns, "Year" and "Mean_Precip". This is observed data rather than model output, which is why there is only one column for mean precipitation. I have code (see below) that runs two-sample t-tests for the observed data (loc_list_OBS) against the future data (loc_list_future), but I am unsure how to change it to run t-tests between the two lists that have 32 models each.
myfun <- function(x,y)
{
  OBS_Data <- x$Mean_Precip
  #Empty list
  List <- list()
  #Now loop
  for(i in 2:dim(y)[2])
  {
    #Label
    val <- names(y[,i,drop=FALSE])
    Future_Data <- y[,i]
    #Test
    test <- t.test(OBS_Data, Future_Data, alternative = "two.sided")
    #Save
    List[[i-1]] <- test
    names(List)[i-1] <- val
  }
  return(List)
}
t.stat <- mapply(FUN = myfun,x=loc_list_OBS,y=loc_list_future, SIMPLIFY = FALSE)
I would suggest the following approach. I have created dummy data similar to yours. Here is the code:
#Data before
dfb <- structure(list(Year = 2010:2019, Model.1 = c(614.1101, 918.0095,
656.7066, 806.9675, 667.4008, 776.1538, 681.5092, 776.3522, 526.9468,
652.4084), Model.2 = c(586.5888, 666.9161, 802.2542, 819.5234,
761.1275, 638.837, 701.2387, 719.4323, 827.0774, 724.7682), Model.3 = c(948.4274,
414.27, 806.2877, 547.663, 600.2612, 564.3342, 643.8965, 575.3022,
596.1884, 735.1566)), class = "data.frame", row.names = c(NA,
-10L))
#Data after
dfa <- structure(list(Year = 2020:2029, Model.1 = c(714.1101, 1018.0095,
756.7066, 906.9675, 767.4008, 876.1538, 781.5092, 876.3522, 626.9468,
752.4084), Model.2 = c(686.5888, 766.9161, 902.2542, 919.5234,
861.1275, 738.837, 801.2387, 819.4323, 927.0774, 824.7682), Model.3 = c(1048.4274,
514.27, 906.2877, 647.663, 700.2612, 664.3342, 743.8965, 675.3022,
696.1884, 835.1566)), class = "data.frame", row.names = c(NA,
-10L))
Now the code:
#Data for lists
L.before <- list(df1=dfb,df2=dfb,df3=dfb,df4=dfb)
L.after <- list(df1=dfa,df2=dfa,df3=dfa,df4=dfa)
The function:
#Function
myfun <- function(x,y)
{
  #Create empty list
  List <- list()
  #Loop over the model columns (column 1 is Year)
  for(i in 2:dim(x)[2])
  {
    name <- names(x[,i,drop=FALSE])
    before <- x[,i]
    after <- y[,i]
    #Test
    test <- t.test(before, after, alternative = "two.sided")
    #Save
    List[[i-1]] <- test
    names(List)[i-1] <- name
  }
  return(List)
}
The application:
#Apply
t.stat <- mapply(FUN = myfun,x=L.before,y=L.after, SIMPLIFY = FALSE)
Some outputs:
t.stat[[1]]
$Model.1
Welch Two Sample t-test
data: before and after
t = -1.9966, df = 18, p-value = 0.06122
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-205.224021 5.224021
sample estimates:
mean of x mean of y
707.6565 807.6565
$Model.2
Welch Two Sample t-test
data: before and after
t = -2.8054, df = 18, p-value = 0.0117
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-174.88934 -25.11066
sample estimates:
mean of x mean of y
724.7764 824.7764
$Model.3
Welch Two Sample t-test
data: before and after
t = -1.4829, df = 18, p-value = 0.1554
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-241.67613 41.67613
sample estimates:
mean of x mean of y
643.1787 743.1787
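With 32 models at four locations, printing every test gets long, so it may help to collapse the nested list into one table of p-values. Here is a minimal sketch of that idea. The names toy_before, toy_after, toy_stat and p_table are made up for illustration, and the toy data are rounded dummy values, not real results. toy_stat has the same shape as t.stat above (one named list of tests per location).

```r
# Toy data: two locations, two models each (made-up values)
before <- data.frame(Year = 2015:2019,
                     Model.1 = c(614, 918, 657, 807, 667),
                     Model.2 = c(587, 667, 802, 820, 761))
after <- data.frame(Year = 2020:2024,
                    Model.1 = c(714, 1018, 757, 907, 767),
                    Model.2 = c(687, 767, 902, 920, 861))
toy_before <- list(df1 = before, df2 = before)
toy_after  <- list(df1 = after,  df2 = after)

# Build a nested list shaped like t.stat: location -> model -> htest object
toy_stat <- Map(function(x, y) {
  models <- names(x)[-1]                      # drop the Year column
  tests <- lapply(models, function(m) t.test(x[[m]], y[[m]]))
  setNames(tests, models)
}, toy_before, toy_after)

# Collapse to one row per location/model pair
p_table <- do.call(rbind, lapply(names(toy_stat), function(loc) {
  tests <- toy_stat[[loc]]
  data.frame(
    Location  = loc,
    Model     = names(tests),
    statistic = vapply(tests, function(z) unname(z$statistic), numeric(1)),
    p.value   = vapply(tests, function(z) z$p.value, numeric(1)),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}))

print(p_table)
```

With your real data you would pass t.stat in place of toy_stat; the rest should not need to change as long as the list elements are named.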
Let me know if that works for you!
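One caveat: myfun pairs the columns by position, so it relies on both data frames listing the models in the same order. If you cannot guarantee that, a variant that matches columns by name might be safer. This is only a sketch under that assumption. The function name ttest_by_name and its id_col argument are hypothetical, and the data are made-up values.

```r
# Hypothetical helper: t-test every model column that appears in both
# data frames, matching by column name instead of by position.
ttest_by_name <- function(x, y, id_col = "Year") {
  models <- setdiff(intersect(names(x), names(y)), id_col)
  tests <- lapply(models, function(m) {
    t.test(x[[m]], y[[m]], alternative = "two.sided")
  })
  setNames(tests, models)
}

# Made-up data; note that the columns of `later` are in a different order
earlier <- data.frame(Year = 2015:2019,
                      Model.1 = c(614, 918, 657, 807, 667),
                      Model.2 = c(587, 667, 802, 820, 761))
later <- data.frame(Model.2 = c(687, 767, 902, 920, 861),
                    Year = 2020:2024,
                    Model.1 = c(714, 1018, 757, 907, 767))

# Map() keeps the names of the first list, like mapply(..., SIMPLIFY = FALSE)
res <- Map(ttest_by_name, list(df1 = earlier), list(df1 = later))
names(res$df1)
res$df1$Model.1$estimate
```

Columns that exist in only one of the two data frames are silently skipped here, which may or may not be what you want.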