I am trying to incorporate the time series database with the laboratory real time monitoring equipment. For scalar data such as temperature the line protocol works well:
temperature,site=reactor temperature=20.0 1556892576842902000
For 1D (e.g., IR Spectrum) or higher dimensional data, I came up two approaches to write data.
ir_spectrum,site=reactor w1=10.0,w2=11.2,w3=11.3,......,w4000=2665.2 1556892576842902000
ir_spectrum,site=reactor data="[10.0, 11.2, 11.3, ......, 2665.2]" 1556892576842902000
I wonder whether there is any recommended way to store the high dimensional data? Thanks!
The first approach is better from the performance and disk space usage PoV. InfluxDB stores each field in a separate column. If a column contains similar numeric values, then it may be compressed better compared to the column with JSON strings. This also improves query speed when selecting only a subset of fields or filtering on a subset of fields.
P.S. InfluxDB may need high amounts of RAM for big number of fields and big number of tag combinations (aka high cardinality). In this case there are alternative solutions, which support InfluxDB line protocol and require lower amounts of RAM for high cardinality time series. See, for example, VictoriaMetrics.