I have a list ("x") of data frames, one per file, with temperatures recorded every ten minutes. Each ID is a separate entry in the list and has a group of associated temperatures. Here's a sample:
head(x)
$165.212
ID Date Time DateTime Temp
165.212 2023-07-18 10:31:00 2023-07-18 10:31:00 20.37
164.212 2023-07-18 10:41:00 2023-07-18 11:35:00 23.4
164.212 2023-07-18 10:51:00 2023-07-18 11:35:00 23.8
Lux Fx Sex Sp TimeOfDay Hour
3060 164.212 M WT 10.58416667 2024-07-18 11:00:00
1287 164.212 M WT 10.75083333 2024-07-18 11:00:00
1128 164.212 M WT 10.91750000 2024-07-18 11:00:00
$164.314
ID Date Time DateTime Temp
164.314 2023-07-18 10:31:00 2023-07-18 11:35:00 32.5
164.314 2023-07-18 10:41:00 2023-07-18 11:35:00 33.2
164.314 2023-07-18 10:51:00 2023-07-18 11:35:00 22.8
Lux Fx Sex Sp TimeOfDay Hour
3060 164.314 M WT 10.58416667 2024-07-18 11:00:00
1287 164.314 M WT 10.75083333 2024-07-18 11:00:00
2700 164.314 M WT 10.91750000 2024-07-18 11:00:00
I want to create a for loop that takes each entry in the list and computes the average hourly and daily temperatures for that ID, preferably returning a new list with the ID, the average hourly temperatures and the date/time.
I have created a column ("Hour") that rounds each time to the nearest hour, and I would like to aggregate the data into groups based on these values.
Here is the code I'm using but I can't figure out why it's not working. I know I need to specify that I'm making a new list.
for (each in x) {
hour_t$ = x %>%
group_by(Hour) %>%
summarise(AvgTemperature = mean(Temp, na.rm = TRUE))
}
Example data (best guess):
list(`165.212` = structure(list(ID = c("165.212", "164.212",
"164.212"), Date = structure(c(19556, 19556, 19556), class = "Date"),
Time = c("10:31:00", "10:41:00", "10:51:00"), DateTime = structure(c(1689676260,
1689680100, 1689680100), class = c("POSIXct", "POSIXt"), tzone = "UTC"),
Temp = c(20.37, 23.4, 23.8), Lux = c(3060L, 1287L, 1128L),
Fx = c(164.212, 164.212, 164.212), Sex = c("M", "M", "M"),
Sp = c("WT", "WT", "WT"), TimeOfDay = c("10.58416667 2024-07-18",
"10.75083333 2024-07-18", "10.91750000 2024-07-18"), Hour = c("11:00:00",
"11:00:00", "11:00:00")), row.names = c(NA, -3L), class = "data.frame"),
`164.314` = structure(list(ID = c("164.314", "164.314", "164.314"
), Date = structure(c(19556, 19556, 19556), class = "Date"),
Time = c("10:31:00", "10:41:00", "10:51:00"), DateTime = structure(c(1689680100,
1689680100, 1689680100), class = c("POSIXct", "POSIXt"
), tzone = "UTC"), Temp = c(32.5, 33.2, 22.8), Lux = c(3060L,
1287L, 2700L), Fx = c(164.314, 164.314, 164.314), Sex = c("M",
"M", "M"), Sp = c("WT", "WT", "WT"), TimeOfDay = c("10.58416667 2024-07-18",
"10.75083333 2024-07-18", "10.91750000 2024-07-18"),
Hour = c("11:00:00", "11:00:00", "11:00:00")), row.names = c(NA,
-3L), class = "data.frame"))
This can easily be done in a single line. Use aggregate() from base R and loop over your list (called data_list here) with lapply():
> lapply(data_list, aggregate, Temp ~ Date + Hour + ID, mean)
$`165.212`
Date Hour ID Temp
1 2023-07-18 11:00:00 164.212 23.60
2 2023-07-18 11:00:00 165.212 20.37
$`164.314`
Date Hour ID Temp
1 2023-07-18 11:00:00 164.314 29.5
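The question also asks for daily averages. Here is a minimal sketch, assuming a daily average is simply the mean of Temp per ID and Date, so that dropping Hour from the formula is enough. demo_list is a hypothetical cut-down copy of the example data, and the call is written with an explicit function and data = so that it does not rely on how aggregate() dispatches on its first argument:

```r
# Hypothetical cut-down version of the example data (only the columns used here).
demo_list <- list(
  `165.212` = data.frame(
    ID   = c("165.212", "164.212", "164.212"),
    Date = as.Date(rep("2023-07-18", 3)),
    Hour = rep("11:00:00", 3),
    Temp = c(20.37, 23.4, 23.8)
  ),
  `164.314` = data.frame(
    ID   = rep("164.314", 3),
    Date = as.Date(rep("2023-07-18", 3)),
    Hour = rep("11:00:00", 3),
    Temp = c(32.5, 33.2, 22.8)
  )
)

# Hourly means: one row per Date, Hour and ID within each list element.
hourly <- lapply(demo_list, function(d)
  aggregate(Temp ~ Date + Hour + ID, data = d, FUN = mean))

# Daily means: the same call with Hour dropped from the grouping variables.
daily <- lapply(demo_list, function(d)
  aggregate(Temp ~ Date + ID, data = d, FUN = mean))
```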
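If you would rather have one list element per ID, row-bind the results and split them by ID (the |> pipe and the formula in split() need R >= 4.1):
lapply(unname(data_list), aggregate, Temp ~ Date + Hour + ID, mean) |>
  do.call(what = rbind) |> split(f = ~ID)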
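Since the question asked for a for loop: the loop there cannot run because `hour_t$ =` is not a complete assignment target, the loop variable each is never used, and the body pipes the whole list x instead of one element. A sketch of a working loop follows, assuming base aggregate() is acceptable in place of the dplyr pipeline. x_demo is a hypothetical stand-in for x, and hour_t is the new list the question mentions:

```r
# Hypothetical stand-in for the list x (only the columns used here).
x_demo <- list(
  `165.212` = data.frame(ID   = c("165.212", "164.212", "164.212"),
                         Date = as.Date(rep("2023-07-18", 3)),
                         Hour = rep("11:00:00", 3),
                         Temp = c(20.37, 23.4, 23.8)),
  `164.314` = data.frame(ID   = rep("164.314", 3),
                         Date = as.Date(rep("2023-07-18", 3)),
                         Hour = rep("11:00:00", 3),
                         Temp = c(32.5, 33.2, 22.8))
)

# Create the result list first, with one empty slot per input element.
hour_t <- vector("list", length(x_demo))
names(hour_t) <- names(x_demo)

# Loop over the element names so each result is stored under the same name.
for (nm in names(x_demo)) {
  hour_t[[nm]] <- aggregate(Temp ~ Date + Hour + ID,
                            data = x_demo[[nm]], FUN = mean)
}
```

The lapply() version does the same thing without the bookkeeping, which is why it is usually preferred for lists.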