Search code examples
sqlpostgresqltime-seriescountertimescaledb

TimescaleDB - Counters


My goal is to query the delta of a counter metric over time using a continuous aggreate from TimescaleDB.

I'm following the example from Running a counter aggregate query with a continuous aggregate

I created the table:

CREATE TABLE example (
    measure_id      BIGINT,
    ts              TIMESTAMPTZ ,
    val             DOUBLE PRECISION,
    PRIMARY KEY (measure_id, ts)
);

Inserted some values for testing:

INSERT INTO example VALUES 
    (1, '2022-05-07 10:00', 0),
    (1, '2022-05-07 10:05', 5),
    (1, '2022-05-07 10:10', 10),
    (1, '2022-05-07 10:15', 15),
    (1, '2022-05-07 10:20', 20),
    (1, '2022-05-07 10:25', 25);

And then run the query that puts this data into time buckets and uses the delta accessor function to calculate the difference of the measurements over time:

SELECT measure_id,
    time_bucket('15 min'::interval, ts) as bucket,
    delta(counter_agg(ts, val, toolkit_experimental.time_bucket_range('15 min'::interval, ts)))
FROM example
GROUP BY measure_id, time_bucket('15 min'::interval, ts);

But the results don't match my expections:

measure_id bucket delta
1 2022-05-07 10:00:00+01 10
1 2022-05-07 10:15:00+01 10

The delta accessor works correctly as it substracts the last value of each time bucket from the first

10 - 0 = 10
and
25 - 15 = 10

but the overall delta is not correct: the metric grew from 0 to 25 which is equal to 25 and not 10 + 10 which is only 20.

How can I get the "proper" delta?


Solution

  • Thanks to the comments from @greenweeds I worked out this solution which seems to

    • account for counter resets and
    • ignore the initial counter value (I just want differences over time and the first value would be treated as such a difference although it is not)

    Hope this helps others as well. My query:

    SELECT
        measure_id,
        time_bucket('15 min', ts) AS bucket,
        SUM(diff_val) AS delta
    FROM(
        SELECT 
            ts, 
            measure_id,
            -- Use 0 on counter resets
            CASE WHEN diff_val >= 0 THEN diff_val
            ELSE 0
            END AS diff_val
        FROM (
            SELECT
                ts,
                measure_id,
                -- Calculate the change in the counter from current to last value
                val - LAG(val,1,0) OVER (
                    ORDER BY ts
                ) AS diff_val
            FROM example
            ORDER BY 1,2
        ) AS base
        -- Remove the first (oldest) data rows - they containt the initial counter values which we ignore as they could be very high and make the final chart unusable
        WHERE ts != (SELECT ts FROM example LIMIT 1)
    ) AS final
    GROUP BY 1,2
    ORDER BY 1,2
    

    I tested this against a more complex data set:

    INSERT INTO example VALUES 
        (1, '2022-05-07 10:00', 1000),
        (1, '2022-05-07 10:05', 1005),
        (1, '2022-05-07 10:10', 1010),
        (1, '2022-05-07 10:12', 0),
        (1, '2022-05-07 10:15', 5),
        (1, '2022-05-07 10:20', 10),
        (1, '2022-05-07 10:31', 15);
    

    And the result is what I expected:

    | measure_id | bucket                 | delta |
    |------------|------------------------|-------|
    | 1          | 2022-05-07 10:00:00+01 |10     |
    | 1          | 2022-05-07 10:15:00+01 |10     |
    | 1          | 2022-05-07 10:30:00+01 |5      |