Search code examples
influxdbinfluxdb-2

InfluxdB Flux Query group by substring


I search a way to group by substring/regex of a column:

I search in the documentation https://docs.influxdata.com/ without success.

The column "proc" is composed as follow:

    CPU-010.CORE010-00
    CPU-010.CORE010-01
    CPU-010.CORE010-02
    CPU-010.CORE010-03
    CPU-010.CORE010-04
    CPU-010.CORE010-05
    ...
    CPU-020.CORE020-00
    CPU-020.CORE020-01
    CPU-020.CORE020-02
    CPU-020.CORE020-03
    CPU-020.CORE020-04
    ...
    CPU-110.CORE110-00
    CPU-110.CORE110-01
    CPU-110.CORE110-02
    CPU-110.CORE110-03
    CPU-110.CORE110-04
    ...
    CPU-120.CORE120-00
    CPU-120.CORE120-01
    CPU-120.CORE120-02
    CPU-120.CORE120-03
    etc..

The csv imported is composed as follow:

#datatype measurement,tag,tag,double,dateTime:number
Processors,srv,proc,usage,time
Processors,srv1,CPU-010.CORE010-00,52,1671231960
Processors,srv1,CPU-010.CORE010-00,50,1671232020
Processors,srv1,CPU-010.CORE010-00,49,1671232080
Processors,srv1,CPU-010.CORE010-00,50,1671232140
Processors,srv1,CPU-010.CORE010-00,48,1671232200
Processors,srv1,CPU-010.CORE010-00,53,1671232260
...
Processors,srv1,CPU-020.CORE020-00,52,1671231960
Processors,srv1,CPU-020.CORE020-00,50,1671232020
Processors,srv1,CPU-020.CORE020-00,49,1671232080
Processors,srv1,CPU-020.CORE020-00,50,1671232140
Processors,srv1,CPU-020.CORE020-00,48,1671232200
Processors,srv1,CPU-020.CORE020-00,53,1671232260
...

I tried with this query without success:

from(bucket: v.bucket)
  |> range(start: v.timeRangeStart, stop:v.timeRangeStop)
  |> filter(fn: (r) => r._measurement == "Processors" and proc.p =~ /(CPU-[09].*)[.].*/)
  |> group(columns: ["proc"])
  |> aggregateWindow(every: v.windowPeriod, fn: mean)

I tried to group as follow:

CPU-010
CPU-020
CPU-110
CPU-120
Etc..

Many thank for any help


Solution

  • Since you are trying to group by the prefix of the column names, you could try following:

    1. create a new column based on the prefix
    2. group by the new column

    Sample code is as follows:

    from(bucket: v.bucket)
       |> range(start: v.timeRangeStart, stop:v.timeRangeStop)
       |> filter(fn: (r) => r._measurement == "Processors")
       |> map(fn: (r) => ({r with newProc: strings.substring(v: r._value, start: 0, end: 7)}))
       |> group(columns: ["newProc"])
       |> aggregateWindow(every: v.windowPeriod, fn: mean)  
    

    Notice: |> aggregateWindow(column: "_value", every: v.windowPeriod, fn: mean) ou can specify the column this way. Before trying this, try comment out this line to see the results before the aggregation, especially the column names.

    See more details for map function and substring function and aggregateWindow function.