Search code examples
goinfluxdb

How to select the closest data to the given time for each group


I'm using InfluxDB 1.4, and here's my task

1) find the closet value for each IDs. 
2) Do 1) for every hour

For example,

select id, value, time from myTable where time = '2018-08-14T00:00:00Z' group by id;
select id, value, time from myTable where time = '2018-08-14T01:00:00Z' group by id;
....
select id, value, time from myTable where time = '2018-08-14T23:00:00Z' group by id;

then, some id have value at each o'clock but others don't. In this case, I want to get the closest row to the give time '2018-08-14T14:00:00Z', like as '2018-08-14T14:00:01Z' or '2018-08-14T13:59:59Z'

and I don't want to query 24 times for each hour. Can I do this task with group by time, id, or something else?


Solution

  • Q: I would like to select the point data closest to the hourly boundary. Is there a way I can do this without having to query 24 times for each day? Will group by time be any help on this?

    A:

    Will group by time be any help on this?

    Unfortunately the group by time function will not be much help to you as it requires the query to have an aggregation function. What the group by time function does is that it groups all data that falls within the interval into one single record by using the aggregation function like sum, mean etc to tabulate the combined row's values.

    Is there a way I can do this without having to query 24 times for each day?

    To the best of my knowledge, I don't think influxdb 1.5 has any way to build a one liner query for this task. Maybe there is something in 1.6, i'm not sure. Haven't tried that.

    At the moment I think your best solution today is to build a query that uses the time filter, order by and limit functions e.g.

    select * from uv where time >= '2018-08-18T14:00:00Z' and time < '2018-08-18T15:00:00Z' order by desc limit 1;

    The query above means that you are selecting all the points within 2pm to 3pm and then order them by descending order but only return the first row, which is what you want.

    If for some reason you can only do 1 HTTP request to influxdb for the hourly data on a particular day. You can bundle up the 24 queries into one big query using the ; seperator and retrieve the data in 1 transaction. E.g.

    select * from uv where time >= '2018-08-18T14:00:00Z' and time < '2018-08-18T15:00:00Z' order by desc limit 1; select * from uv where time >= '2018-08-18T15:00:00Z' and time < '2018-08-18T16:00:00Z' order by desc limit 1; select * from uv where time >= '2018-08-18T16:00:00Z' and time < '2018-08-18T17:00:00Z' order by desc limit 1;

    Output:

    name: uv
    time                tag1 id         value
    ----                -------- --         -----
    1534603500000000000 apple  uv 2
    1534607100000000000 apple  uv 1
    1534610700000000000 apple  uv 3.1