I am processing time-series-like data with three columns, say t_1, t_2, and att. t_1 and t_2 are ordered time observations, and att is a numerical value.
A toy example of the data:
t_1 t_2 att
12:30:32 12:33:12 1
12:30:55 12:33:43 3
12:31:21 12:34:34 2
The object I want to build should follow these rules:
If t_1 is "continuous", build a time series object with t_1 as the time index and att and t_2 as values.
If t_1 is not "continuous" but t_2 is, build a time series object with t_2 as the time index and t_1 and att as values.
If neither t_1 nor t_2 is continuous, report a message and build nothing.
Here "continuous" means that the interval between consecutive observations is less than, say, 1 hour.
An example with non-continuous t_1 but continuous t_2:
t_1 t_2 att
12:30:32 12:33:12 1
12:30:55 12:33:43 3
14:31:21 12:34:34 2
14:33:24 12:35:34 -12
Any ideas for an implementation in either Python or R would be very welcome. The data will be imported as a dataframe, either a pandas DataFrame or an R data frame.
By time series object I mean something like xts or ts in R.
Hopefully, you can build time series objects from the tuples built from this:
import datetime

data = [['12:30:32', '12:33:12', 1],
        ['12:30:55', '12:33:43', 3],
        ['14:31:21', '12:34:34', 2],
        ['14:33:24', '12:35:34', -12]]

def continuous(series, time_format='%H:%M:%S', criteria=3600):
    '''Returns True if time series is continuous.

    series -- sequence of strings
    time_format -- str (default '%H:%M:%S')
    criteria -- int, largest allowed gap in seconds (default 3600)
    '''
    # make datetime objects
    t = [datetime.datetime.strptime(thing, time_format) for thing in series]
    # find the deltas between consecutive observations
    deltas = (two - one for one, two in zip(t, t[1:]))
    # apply the criteria; total_seconds() is used because .seconds ignores the
    # days component, so a backwards step would wrap around to a positive value
    return all(0 <= delta.total_seconds() <= criteria for delta in deltas)

# extract the time series data
one, two, values = zip(*data)
if continuous(one):
    # make tuples - (t1, (t2, att))
    time_series_data = [(t1, (t2, att)) for t1, t2, att in zip(one, two, values)]
elif continuous(two):
    # make tuples - (t2, (t1, att))
    time_series_data = [(t2, (t1, att)) for t1, t2, att in zip(one, two, values)]
else:
    raise ValueError('No Continuous Data')
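Since the question mentions that the data arrives as a pandas DataFrame, the same rule could also be applied to the frame directly. The following is only a rough sketch under some assumptions: the function name build_time_series and its max_gap parameter are made up for illustration, the columns are assumed to be named t_1, t_2 and att with times as '%H:%M:%S' strings, and "continuous" is taken to mean that every consecutive gap is non-negative and strictly below one hour.

```python
import pandas as pd

def build_time_series(df, max_gap=pd.Timedelta(hours=1)):
    """Index df by the first continuous time column, trying t_1 and then t_2.

    build_time_series and max_gap are illustrative names, not part of pandas.
    """
    for col in ("t_1", "t_2"):
        # parse the clock-time strings; the date part defaults to 1900-01-01
        times = pd.to_datetime(df[col], format="%H:%M:%S")
        # gaps between consecutive rows (the first row has no predecessor)
        gaps = times.diff().dropna()
        # assumed definition: every gap is non-negative and below max_gap
        if ((gaps >= pd.Timedelta(0)) & (gaps < max_gap)).all():
            out = df.copy()
            out[col] = times
            return out.set_index(col)
    raise ValueError("neither t_1 nor t_2 is continuous")

df = pd.DataFrame({
    "t_1": ["12:30:32", "12:30:55", "14:31:21", "14:33:24"],
    "t_2": ["12:33:12", "12:33:43", "12:34:34", "12:35:34"],
    "att": [1, 3, 2, -12],
})
ts = build_time_series(df)
print(ts)
```

On the example data, t_1 jumps by about two hours between the second and third rows, so the result should be indexed by t_2 with t_1 and att as the remaining columns. A frame with a DatetimeIndex is usually what pandas time series tools expect, which is why the sketch returns the indexed frame rather than tuples.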