r time-series intervals posixct

Subset datetime series into 1h intervals


I have a data frame (dim: 589) with POSIXct class values.

   df <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "   id date       time        par_surface
           1   2014-07-19 07:10:00       907.6
           2   2014-07-19 07:11:00       956.2
           3   2014-07-19 07:12:00       963.4
           4   2014-07-19 07:14:00       957.6
           5   2014-07-19 07:15:00       876.8
           6   2014-07-19 07:16:00       883.6
           7   2014-07-19 07:18:00       903.8
           8   2014-07-19 07:18:59       817.4
           9   2014-07-19 07:19:59       880.8
           10  2014-07-19 07:21:59       877.6
           11  2014-07-19 07:22:59       960.0
           12  2014-07-19 07:24:00       977.8
           13  2014-07-19 07:26:00       964.0
           14  2014-07-19 07:27:00       995.0
           15  2014-07-19 07:28:00      1053.8
           16  2014-07-19 07:29:59      1024.4
           17  2014-07-19 07:30:59       916.0
           18  2014-07-19 07:31:59      1042.6
           19  2014-07-19 07:34:00      1047.4
           20  2014-07-19 07:35:00      1022.8
           21  2014-07-19 07:36:00      1023.8
           22  2014-07-19 07:38:00       993.2
           23  2014-07-19 07:39:00      1009.4
           24  2014-07-19 07:39:59       950.0
           25  2014-07-19 07:42:00       986.2
           26  2014-07-19 07:43:00       971.0
           27  2014-07-19 07:44:00       879.6
           28  2014-07-19 07:46:00       841.6
           29  2014-07-19 07:47:00       928.8
           30  2014-07-19 07:47:59      1000.8
           31  2014-07-19 07:50:00      1027.8
           32  2014-07-19 07:51:00       977.2
           33  2014-07-19 07:51:59      1040.4
           34  2014-07-19 07:54:00      1049.4
           35  2014-07-19 07:54:59      1131.6
           36  2014-07-19 07:55:59      1186.2
           37  2014-07-19 07:58:00      1171.0
           38  2014-07-19 07:58:59      1168.8
           39  2014-07-19 08:00:00      1093.8
           40  2014-07-19 08:02:00      1204.8
           41  2014-07-19 08:03:00      1214.8
           42  2014-07-19 08:03:59      1224.2
           43  2014-07-19 08:05:59      1217.2
           44  2014-07-19 08:06:59      1239.2
           45  2014-07-19 08:08:00      1196.2
           46  2014-07-19 08:10:00      1203.8
           47  2014-07-19 08:10:59      1211.8
           48  2014-07-19 08:12:00      1167.2
           49  2014-07-19 08:13:59      1163.2
           50  2014-07-19 08:15:00      1179.6
           51  2014-07-19 08:16:00      1218.2
           52  2014-07-19 08:18:00      1245.4")
   # The date and clock time are read as two whitespace-separated fields,
   # so combine them into a single POSIXct column named time
   df$time <- as.POSIXct(paste(df$date, df$time))
   df$date <- NULL

Now I need to subset this at hourly intervals. It is important that the first value is retained:

     time                      par_surface
 1   2014-07-19 07:10:00       907.6
 2   2014-07-19 08:10:00       1203.8
 ...

I tried split(), cut() and plyr::ddply(), but none of them worked.


Solution

  • I assume that you want the mean of the values within each hour:

    df <- aggregate(list(col2 = df$col2),
                    by = list(timestamp = cut(as.POSIXct(df$timestamp), "hour")),
                    FUN = mean)
    

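    Here col2 is the name of the value column and timestamp is the name of the date-time column; for the data in the question these would be par_surface and time. Note that cut(..., "hour") bins by clock hour (07:00, 08:00, ...), not by hours counted from the first observation.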
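    The two readings of the question can be sketched as follows, using the column names time and par_surface from the question and a six-row stand-in for the data. The UTC time zone and the helper names secs, targets and idx are assumptions made for illustration, and the second part assumes the rows are sorted by time. The first part adapts the aggregate() call to hourly means; the second keeps the first row and then the first row at or after each full hour counted from it, which is one plausible way to get the output the question asks for.

```r
# Six-row stand-in for the question's data (UTC is an assumption).
df <- data.frame(
  time = as.POSIXct(c("2014-07-19 07:10:00", "2014-07-19 07:11:00",
                      "2014-07-19 07:58:59", "2014-07-19 08:10:00",
                      "2014-07-19 08:10:59", "2014-07-19 08:18:00"),
                    tz = "UTC"),
  par_surface = c(907.6, 956.2, 1168.8, 1203.8, 1211.8, 1245.4)
)

# Reading 1: mean per clock hour (bins start at 07:00 and 08:00).
hourly_mean <- aggregate(list(par_surface = df$par_surface),
                         by = list(hour = cut(df$time, "hour")),
                         FUN = mean)

# Reading 2: keep the first row, then one row per hour after it.
# Work in seconds since the epoch so the arithmetic is plain numeric.
secs <- as.numeric(df$time)

# Target times: the first timestamp, then every 3600 s up to the last one.
targets <- seq(from = secs[1], to = secs[length(secs)], by = 3600)

# For each target, index of the first row at or after it
# (relies on df being sorted by time).
idx <- vapply(targets, function(t) which(secs >= t)[1], integer(1))

# unique() guards against a gap longer than one hour mapping
# two targets to the same row.
hourly <- df[unique(idx), ]
print(hourly)
```

    On the full data the same steps should apply to df unchanged. If the closest observation to each target is wanted instead of the first one at or after it, the which(secs >= t)[1] step could be swapped for which.min(abs(secs - t)).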