I want to calculate a daily mean from values recorded periodically throughout each day, for a number of different days in a data set. tapply() works well when my date column is a factor:
> Data$Date <- as.factor(Data$Date)
> str(Data$Date)
Factor w/ 55 levels "01/05/2014","02/05/2014",..: 3 3 3 3 3 3 3 3 3 3 ...
> tapply(Data$Humidity,Data$Date, FUN = mean)
01/05/2014 02/05/2014 03/04/2014 03/05/2014 04/04/2014 04/05/2014 05/04/2014 05/05/2014 06/04/2014
99.96875 100.00000 96.65833 99.80625 84.14375 89.56042 93.75833 39.58750 87.55000
This gives me exactly what I want, but the dates are no longer in chronological order: as a factor, the levels are sorted as dd/mm/yyyy character strings rather than as dates.
Instead I tried strptime() to convert the column to a date format that R recognises. Starting again from the beginning:
> Data$Date<-strptime(Data$Date, format="%d/%m/%Y")
> str(Data$Date)
POSIXlt[1:2586], format: "2014-04-03" "2014-04-03" "2014-04-03" "2014-04-03" "2014-04-03" "2014-04-03" ...
> tapply(Data$Humidity,Data$Date, FUN = mean)
Error in INDEX[[i]] : subscript out of bounds
But I just get the error message above. Does anyone know why this isn't working?
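One likely cause, offered as a sketch rather than a confirmed diagnosis: strptime() returns a POSIXlt object, which is stored internally as a list of components (seconds, minutes, hours and so on), and tapply() appears to treat a list INDEX as several separate grouping factors. Assuming that is the problem, converting with as.Date() gives a plain Date vector that tapply() can group on directly. The toy data frame below is made up purely for illustration and only stands in for the real Data.

```r
# Toy data; the five rows and their values are invented for illustration only.
Data <- data.frame(
  Date     = c("03/04/2014", "03/04/2014", "01/05/2014", "01/05/2014", "04/04/2014"),
  Humidity = c(96, 98, 100, 99, 84),
  stringsAsFactors = FALSE
)

# as.Date() returns a plain Date vector (not a list, unlike POSIXlt)
Data$Date <- as.Date(Data$Date, format = "%d/%m/%Y")

# tapply() turns the Date index into a factor whose levels are in date order
daily <- tapply(Data$Humidity, Data$Date, FUN = mean)
print(daily)
```

The names of the result are the dates in yyyy-mm-dd form, so they come out in chronological order without a separate sorting step.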
I also found that I can convert the tapply() output with strptime() afterwards, via data.frame(), rather than converting beforehand, and then order() by date:
Data$Date <- as.factor(Data$Date)
DAVEH <- tapply(Data$Humidity,Data$Date, FUN = mean)
site.daily<-data.frame(c(names(DAVEH)),c(DAVEH))
rownames(site.daily)<-seq_len(nrow(site.daily))
colnames(site.daily)<-c("Date","DAVEH")
site.daily$Date<-strptime(site.daily$Date, format="%d/%m/%Y")
site.daily<-site.daily[order(site.daily$Date),]
rownames(site.daily)<-seq_len(nrow(site.daily)) # again as they have been re-ordered
> site.daily
Date DAVEH
1 2014-04-03 96.65833
2 2014-04-04 84.14375
3 2014-04-05 93.75833
4 2014-04-06 87.55000
5 2014-04-07 58.87708
6 2014-04-08 99.83542
7 2014-04-09 87.68125.....
and so on.
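The same kind of table can probably be built in fewer steps. A minimal sketch, assuming the date column is first converted with as.Date() and using aggregate() from base R in place of tapply() plus data.frame(); the toy data below is invented for illustration and is not the real Data.

```r
# Toy data; the five rows and their values are invented for illustration only.
Data <- data.frame(
  Date     = c("03/04/2014", "03/04/2014", "01/05/2014", "01/05/2014", "04/04/2014"),
  Humidity = c(96, 98, 100, 99, 84),
  stringsAsFactors = FALSE
)
Data$Date <- as.Date(Data$Date, format = "%d/%m/%Y")

# One row per date, holding the mean humidity for that date
site.daily <- aggregate(Humidity ~ Date, data = Data, FUN = mean)

# Rename the value column to match the name used earlier
colnames(site.daily) <- c("Date", "DAVEH")
print(site.daily)
```

Because the grouping column is a Date rather than a character factor, the rows should already be in chronological order, so the order() and rownames() steps are not needed here.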