Search code examples
databasedatabase-designinfluxdb

Comparison on tags (indexed) in Influxdb don't work


I am using InfluxDb as my database. In InfluxDb, the data is stored in measurements (similar to table), tags (indexed) and fields.

Tags are indexed and thus result in faster retrieval of data.

My query parameters are start time and end time and I am keeping them in tags.

INSERT supply,start_time=10,end_time=20 value=100

Here start_time and end_time are stored in string format.

SELECT * FROM supply WHERE start_time > 5

returns NULL because as the tags are stored in string.

Question 1: is there any way to do comparison in tags? If yes, how?

Question 2: I can query on start_time and end_time by making them fields. But they are not indexed. So should I switch to some other db like Postgres if it involves a lot of overhead in querying with fields or is it fine if I query on unindexed fields. I have about 10 Million+ entries in database and need quick retrieval.


Solution

  • Question-1: Yes, you can compare tags in Influxdb but only with "=" operator. Since ,tags are by default stored in string datatype so "=" operator works for them but not the inequality and therefore, query will be slow if you query based on fields. Here is part taken from influxdb documentation -

      Tags are by default stored in string datatype so "=" operator works 
      for them but not the inequality and therefore, 
      query will be slow if you query based on fields .
    

    Writing Data in Influxdb

    Also , InfluxDB also stores time at which the data point was being inserted . Check if you can use this default timestamp also .

    Question2-: Before moving to sql databases like postgresql , you should do a benchmarking evaluations on both DBs with respect to the insertions , query optimizations ,cluster management(if needed) .

    Note: I am using influxdb query in the following way -

    select sum(field_value) as FIELD_VALUE_SUM from measurement_name where tag_key='1234' and time> '2017-03-10 00:00:00' and time<'2017-05-10 00:00:00';
    

    and get output as -

     name: measurement_name
     time                   FIELD_VALUE_SUM
     ----                   ------
     1489104000000000001    1058
    

    Let me know , if I missed few point and I will try to update my answer.

    Thanks