I have multiple datasets of 1 second time resolution real data. This data will often have gaps in the time-series where the instrument dropped data, or when the instrument was turned off, resulting in a patchy (albeit still very useful) dataset. The resulting data might look like the following
Timestamp [timestamp] : datastream1 [double precision] : datastream2 [double precision] : etc
2011-01-01 00:00:01 153.256 1255.325
2011-01-01 00:00:02 152.954 1254.288
2011-01-01 00:00:03 151.738 1248.951
2011-01-01 00:00:04 150.015 1249.185
2011-01-01 00:10:08 179.132 1328.115
2011-01-01 00:10:09 178.051 1323.125
2011-01-01 00:10:10 180.870 1336.983
2011-01-04 09:19:02 152.198 1462.814
2011-01-04 09:19:03 158.014 1458.122
2011-01-04 09:19:04 156.070 1464.174
Please note: this data are generally continuous but will have random gaps which must be dealt with.
I need to write code to take the average and stdev of a given time interval, "timeInt", that is able to deal with these gaps. For example, if I wanted a 10 min average of data, my required output would be:
Timestamp_10min : avg_data1 : med_data1 : count_data1
where avg_data1 would be the average of all the data points within a given 10 minute period, and count_data1 would be the number of points used in the calculation of that average (i.e. 600 if there was no missing data, 300 if every second point is missing, etc etc).
This code needs to work with any desired input interval (i.e. x minutes, y days, z weeks, months, years, etc).
Currently I am only able output minute averages using the following code.
CREATE OR REPLACE VIEW "DATATABLE_MIN" AS
SELECT MIN("DATATABLE"."Timestamp") AS "Timestamp_min",
avg("DATATABLE"."datastream1") AS "datastream1_avg_min",
stddev("DATATABLE"."datastream1") AS "datastream1_stdev_min",
count("DATATABLE"."datastream1") AS "datastream1_avg_min"
FROM "DATATABLE"
GROUP BY to_char("DATATABLE"."Timestamp",'YYYY-MM-DD HH24:MI'::text);
Thanks in advance for the help!
To group by 10 minutes, you can do this by using the "epoch":
SELECT MIN(dt."Timestamp") AS "Timestamp_min",
avg(dt."datastream1") AS "datastream1_avg_min",
stddev(dt."datastream1") AS "datastream1_stdev_min",
count(dt."datastream1") AS "datastream1_avg_min"
FROM "DATATABLE" dt
GROUP BY trunc(extract(epoch from dt."TimeStamp") / (60*10));
This is the number of seconds since a fixed time in the past. If you divide it by 600, you get the number of 10 minute intervals -- what you need for the aggregation.