What are series and buckets in InfluxDB?


While trying to understand the different concepts in InfluxDB, I came across this documentation, which compares its terms with those of an SQL database.

An InfluxDB measurement is similar to an SQL database table.
InfluxDB tags are like indexed columns in an SQL database.
InfluxDB fields are like unindexed columns in an SQL database.
InfluxDB points are similar to SQL rows.

But there are a couple of other terms that I came across and could not clearly understand, and I am wondering whether there is an SQL equivalent for them.

Series
Bucket

From what I understand from the documentation

series is the collection of data that share a retention policy, measurement, and tag set.

Does this mean a series is a subset of the data in a database table? Or is it like a database view?
I could not find any documentation explaining buckets. I guess this is a new concept in the 2.0 release.

Can someone please clarify these two concepts?


Solution

  • I have summarized my understanding below:

    • A bucket is a named location with a retention policy where time-series data is stored.
    • A series is a logical grouping of data defined by a shared measurement, tag set, and field key.
    • A measurement is similar to an SQL database table.
    • A tag is similar to an indexed column in an SQL database.
    • A field is similar to an unindexed column in an SQL database.
    • A point is similar to an SQL row.

    For example, a SQL table workdone:

    Email               Status  time                 Completed
    lorr@influxdb.com   start   1636775801000000000  76
    lorr@influxdb.com   finish  1636775868000000000  120
    marv@influxdb.com   start   1636775801000000000  0
    marv@influxdb.com   finish  1636775868000000000  20
    cliff@influxdb.com  start   1636775801000000000  54
    cliff@influxdb.com  finish  1636775868000000000  56

    The columns Email and Status are indexed.

    Hence:

    • Measurement: workdone
    • Tags: Email, Status
    • Field: Completed
    • Series (cardinality = 3 Email values x 2 Status values x 1 field key = 6):
      1. Measurement: workdone; Tags: Email: lorr@influxdb.com, Status: start; Field: Completed
      2. Measurement: workdone; Tags: Email: lorr@influxdb.com, Status: finish; Field: Completed
      3. Measurement: workdone; Tags: Email: marv@influxdb.com, Status: start; Field: Completed
      4. Measurement: workdone; Tags: Email: marv@influxdb.com, Status: finish; Field: Completed
      5. Measurement: workdone; Tags: Email: cliff@influxdb.com, Status: start; Field: Completed
      6. Measurement: workdone; Tags: Email: cliff@influxdb.com, Status: finish; Field: Completed
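
    The cardinality count above can be checked with a short Python sketch. This is not InfluxDB client code: the rows are copied from the example table, and `series_key` and `to_line_protocol` are hypothetical helper names introduced here for illustration. It assumes that Completed is stored as an integer field (hence the `i` suffix in line protocol) and that the tag values need no escaping.

```python
# Rows copied from the example workdone table: (Email, Status, time, Completed)
ROWS = [
    ("lorr@influxdb.com", "start", 1636775801000000000, 76),
    ("lorr@influxdb.com", "finish", 1636775868000000000, 120),
    ("marv@influxdb.com", "start", 1636775801000000000, 0),
    ("marv@influxdb.com", "finish", 1636775868000000000, 20),
    ("cliff@influxdb.com", "start", 1636775801000000000, 54),
    ("cliff@influxdb.com", "finish", 1636775868000000000, 56),
]

MEASUREMENT = "workdone"
FIELD_KEY = "Completed"


def series_key(email, status):
    """Hypothetical helper: identify a series by measurement, tag set, and field key."""
    return (MEASUREMENT, ("Email", email), ("Status", status), FIELD_KEY)


def to_line_protocol(email, status, timestamp, completed):
    """Format one point as line protocol: measurement,tag_set field_set timestamp."""
    return (
        f"{MEASUREMENT},Email={email},Status={status} "
        f"{FIELD_KEY}={completed}i {timestamp}"
    )


# The time column and the field value are not part of the series key,
# so only the distinct (measurement, tags, field key) combinations count.
series = {series_key(email, status) for email, status, _, _ in ROWS}
cardinality = len(series)

lines = [to_line_protocol(*row) for row in ROWS]

print(cardinality)
for line in lines:
    print(line)
```

    With these six rows every point falls into its own series, so the count is 6. Writing further points with an existing Email and Status combination at later timestamps would add points to those series without increasing the cardinality.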

    Splitting a logical series across multiple buckets may not improve performance, and it may complicate Flux queries, because each query then needs to include every bucket that holds part of the series.
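
    To make the extra query complexity concrete, here is a minimal Python sketch that only builds a Flux query string; it does not connect to a server. The function name `multi_bucket_query` and the bucket names in the usage lines are assumptions made up for this illustration. It relies on Flux's `union()` function, which needs at least two input streams, so the single-bucket case is returned without it.

```python
def multi_bucket_query(buckets, measurement, start="-1h"):
    """Build a Flux query string that reads one measurement from several buckets."""
    streams = [
        f'from(bucket: "{bucket}") |> range(start: {start}) '
        f'|> filter(fn: (r) => r._measurement == "{measurement}")'
        for bucket in buckets
    ]
    if len(streams) == 1:
        # A single bucket needs no union
        return streams[0]
    # Every additional bucket adds another stream to merge
    return "union(tables: [\n  " + ",\n  ".join(streams) + "\n])"


# Hypothetical bucket names, for illustration only
single = multi_bucket_query(["workdone_all"], "workdone")
split = multi_bucket_query(["workdone_2021", "workdone_2022"], "workdone")

print(single)
print(split)
```

    The single-bucket query is one pipeline, while the split version has to repeat the `from`, `range`, and `filter` steps for each bucket before merging the results.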