Hello guys I'm working on a interesting real time application.
The application is as follows.I have a meter
model and meter_info
model
calss Meter
has_many :meter_infos
# filed: id
end
class MeterInfo
belongs_to :meter
# field: meter_id, voltage
end
In every two minutes a new data is being saved to meter_info
table.So you can imagine there are a huge data set over there.
Now what I want do is to find out exactly one voltage record of 10 meters
each at a time in 10 minutes interval within 1 day.
So the result would be something like this
id created_at meter_id voltage
2001 2017-10-19 15:40:00 2 100
2001 2017-10-19 15:45:00 1 100
2001 2017-10-19 15:39:00 3 100
2001 2017-10-19 15:48:00 4 100
2001 2017-10-19 15:38:00 5 100
2001 2017-10-19 15:42:00 6 100
...
...
I've tried several queries but as it's taking too much time to find out the record, the request gets timeouted. Here is what I have tried for
(('2017-07-02 00:00:00').to_datetime.to_i ..
('2017-07-02 23:59:59').to_datetime.to_i).step(10.minutes) do |date|
query = "SELECT created_at, meter_id, voltage
FROM meter_infos
WHERE created_at between '#{Time.at(date).utc}' and
'#{Time.at(date).utc + 10.minutes}'
AND meter_id in (1,2,3,4,5)
ORDER BY id desc limit 1"
voltages = ActiveRecord::Base.connection.execute(query)
end
Which is timeouted even in the development environment.
Then I've tried to use Postgresql
's generated_series
like below
query= "SELECT meter_id,voltage, count(id) as ids
, GENERATE_SERIES( timestamp without time zone '2017-10-19',
timestamp without time zone '2017-10-19',
'10 min') as time_range
from meter_infos
where meter_infos.created_at between '2017-10-19 00:00:01'::timestamp and '2017-10-19 23:59:59'::timestamp
and meter_infos.meter_id in (1,2,3,4,5)
GROUP BY meter_id, voltage
ORDER BY meter_id ASC limit 1"
sbps_plot = ActiveRecord::Base.connection.execute(query)
Which is faster but gives me wrong data.
I am using Ruby on Rails
and Postgresql
.
Can somebody help me to write the faster query to find out data against time or suggest me any procedure to handle time series data analysis.
Thanks in advance.
You have records every two minutes, but you want to get a sample record from ten minute intervals. Here's my suggested solution:
You can take the modulus of the epoch time of the created_at
timestamp with 600 (ten minutes in seconds). Then compare this against some 'tolerance' value (e.g. 119 seconds or less) in case the timestamps of your records aren't aligned to perfect ten minute intervals. Think of it of retrieving the first record with a created_at
inside a 2 minute window following each 10 minute interval of the day.
For example,
MeterInfo
.where(
meter_id: [1, 2, 3, 4, 5],
created_at: your_date.beginning_of_day..your_date.end_of_day
)
.where("(cast(extract(epoch from created_at) as integer) % 600) < 119")
Give that a try and see if it works for you.