I am trying to do advanced analytic queries to drive a web application. I am using Hapi, Objection, Knex, and Postgres with TimescaleDB. It all is working well for typical relational queries. However, I cannot figure out how to perform this aggregation query that involves joining with an anonymous table that is generated from Postgres's generate_series
. I have had to resort to writing the raw SQL, rather than the Objection/Knex query builder. I am using a few of Postgres's built in functions, along with time_bucket
from Timescale. time_bucket
essentially creates a roll up of the data based on the interval specified in the argument. Check this link out for more information about what I'm attempt to do Gap Filling.
Here's the query, which works using the raw method on the Objection Model. I believe doing string interpolation like this will result in potential SQL injection. However, I was hoping to get this converted into the query builder methods that Objection/Knex use so it's more JavaScript, rather than SQL, which would solve the SQL injection issue.
let errorHistorgram = await Errors
.raw(`SELECT period AS daily, coalesce(count,0) AS count
FROM generate_series(date '${startTS}', date '${today}', interval '1d') AS period
LEFT JOIN (
SELECT time_bucket('1d',timestamp)::date AS date, count(timestamp)
FROM my_error_table
WHERE severity = 'HIGH'
AND timestamp >= '${startTS}' AND timestamp < '${today}'
AND device_id = ${deviceId}
GROUP BY date
) t ON t.date = period;`)
.debug();
I have made several attempts at this with Objection/Knex. This was my most successful attempt at crafting this query. However, I believe that the where clause is not in the correct place.
let errorHistorgram = await Errors
.query()
.select(raw(`time_bucket('1 day', timestamp) AS daily, count(timestamp)`))
.where('device_id', deviceId)
.andWhere('timestamp', '>', startTS)
.andWhere('severity', 'HIGH')
.leftJoin(`generate_series(date ${startTS}, date ${today}, interval 1d) AS series`, 'series.date', 'my_error_table.timestamp')
.debug();
With .debug()
, I'm able to see the output of the query, which is posted below.
select time_bucket('1 day', timestamp) AS daily, count(timestamp)
from my_error_table
left join "generate_series(date 2018-11-08T15:35:33"."050Z, date 2018-11-15T15:35:33"."133Z, interval 1d)" as "series"
on "series"."date" = my_error_table."timestamp"
where "device_id" = ? and "timestamp" > ? and "severity" = ?'
Any assistance is appreciated, as I have not used Objection to do this, and cannot find any documentation on it.
I got it to execute the query with Objection. However, I get an empty array as the result. Unlike with the raw SQL query that I crafted above (which does give me the expected results), I just get an empty array as output for the query builder. Any idea as to what I'm doing wrong. I've tried to flip the join to a right join with no luck.
let errorHistorgram = await Errors
.query()
.select(raw(`time_bucket('1 day', timestamp) AS daily, count(timestamp)`))
.where('device_id', deviceId)
.andWhere('timestamp', '>', startTS)
.andWhere('severity', 'HIGH')
.groupBy('timestamp')
.rightJoin(raw(`generate_series(date '${startTS}', date '${today}', interval '1d') AS series`), 'series.date', 'my_error_table.timestamp')
.debug();
Attached is the SQL output from Debug.
select time_bucket('1 day', timestamp) AS daily, count(timestamp)
from my_errors_table
right join generate_series(date '2018-11-08', date '2018-11-15', interval '1d') AS series
on series = my_errors_table.timestamp
where device_id = ? and timestamp > ? and severity = ?
group by timestamp
Timescale released a new feature called Time Bucket Gapfill. It made this much easier because you no longer have to do a left join with the generate_series
to gap fill.
I have included an example of how to implement this with an ObjectionJS model called Errors
. The inputs to time_bucket_gapfill
function are bucket size, timestamp column name, startTS, and endTS. The bucket size variable should be a string with ""
(not single quotes) that corresponds to the bucket size (eg: "10 seconds"
, "30 minutes"
, "1 hour"
, "1 day"
). The startTS
and stopTS
should be ISO Date Strings. The second select statement requires COALESCE
so that it will output 0 if there is a bucket generated where there is no data contained in the bucket. The group by
is required for the bucketing to correctly summarize the data based on the aggregate SQL function you supply in the select
statement.
import { raw } from 'objection';
const errors = await Errors
.query()
.select(
raw("time_bucket_gapfill(?, timestamp, ?, ?) AS bucket", [bucketWidth, startTS, endTS]),
raw('COALESCE(count(timestamp), 0) AS count'),
).where('device_id', deviceId)
.andWhere('timestamp', '>=', startTS)
.andWhere('timestamp', '<=', endTS)
.groupBy('bucket')
.orderBy('bucket');