I have a measurement I want to persist in an InfluxDB database. The measurement itself consists of approx. 4000 measurement points generated by a microcontroller. The points are floats and are produced periodically (every few minutes) at a constant sampling frequency. I'm trying to build up some knowledge of NoSQL databases, and InfluxDB is my first attempt.
Question is: how do I get these measurements into InfluxDB, assuming they arrive in an MQTT message (in JSON format)? How are the insert strings generated/handled?
{
  "begin_time_of_meas": "2020-11-19T16:02:48+0000",
  "measurement": [
    1.0,
    2.2,
    3.3,
    ...,
    3999.8,
    4000.4
  ],
  "device": "D01"
}
I have used Node-RED in the past and I know there is a plugin for InfluxDB, so I guess that would be one way. But I'm quite unsure how the insert string is generated/handled for an array of measurement points. Every example I have seen so far handles only single-point measurements, like one temperature reading every few seconds or CPU load. Thanks for your help.
I've successfully used the influxdb plugin with a time precision of milliseconds. Not sure how to make it work for more precise timestamps, and I've never needed to.
It sounds like you have more than a handful of points arriving per second; send groups of messages as an array to the influx batch node.
In your case, it depends on what those 4000 measurements are and how it makes the most sense to group them. If the values all measure the same kind of thing, something like the block below might work (I have no idea what the measurements actually represent). A function node that takes the MQTT message and converts it into a block of messages like this could work well (note that the output of such a function could replace the join node):
[{
    measurement: "microcontroller_data",
    timestamp: new Date("2020-11-19T16:02:48+0000").getTime(),
    tags: {
        device: "D01",
        point: "0001",
    },
    fields: {
        value: 1.0
    }
},
{
    measurement: "microcontroller_data",
    timestamp: new Date("2020-11-19T16:02:48+0000").getTime(),
    tags: {
        device: "D01",
        point: "0002",
    },
    fields: {
        value: 2.2
    }
},
...etc...
]
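As a rough sketch of such a conversion in a Node-RED function node, assuming the MQTT-in node delivers a parsed JSON object in msg.payload, that all values share begin_time_of_meas as their timestamp (if they are spread over time you would add a per-index offset based on your sampling frequency), and using the illustrative names from the example above:

// Function node: turn one MQTT message into an array of points
// for the influx batch node.
const meas = msg.payload;
const ts = new Date(meas.begin_time_of_meas).getTime(); // ms since epoch

msg.payload = meas.measurement.map((value, i) => ({
    measurement: "microcontroller_data",
    timestamp: ts, // add an offset per index here if the points are spaced in time
    tags: {
        device: meas.device,
        point: String(i + 1).padStart(4, "0") // "0001", "0002", ...
    },
    fields: {
        value: value
    }
}));

return msg; // wire this output straight to the influx batch node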
That looks like a lot of information to store, but the measurement and tags values are basically header values that don't get written with every entry. The fields values do get stored, but these are compressed. The JSON describing the data to be stored is much larger than the on-disk space the storage will actually use.
It's also possible to have multiple fields, but I believe this will make data retrieval trickier:
{
    measurement: "microcontroller_data",
    timestamp: new Date("2020-11-19T16:02:48+0000").getTime(),
    tags: {
        device: "D01",
        point: "0001",
    },
    fields: {
        value_0001: 1.0,
        value_0002: 2.2,
        ...etc...
    }
}
Easier to code, but it would make for some ugly and inflexible queries.
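For comparison, a sketch of that multi-field variant in a function node (same assumptions about msg.payload as the function above; the value_0001, value_0002, ... field names are just illustrative):

// Function node: pack the whole array into a single point with many fields.
const meas = msg.payload;
const fields = {};
meas.measurement.forEach((value, i) => {
    fields["value_" + String(i + 1).padStart(4, "0")] = value;
});

msg.payload = [{
    measurement: "microcontroller_data",
    timestamp: new Date(meas.begin_time_of_meas).getTime(),
    tags: { device: meas.device },
    fields: fields
}];

return msg;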
You will likely have some more meaningful names than "microcontroller_data", or "0001", "0002" etc. If the 4000 signals are for very distinct measurements, it's also possible that there is more than one "measurement" that makes sense, e.g. cpu_parameters, flowrate, butterflies, etc.
Parse your MQTT messages into that shape. If the messages are sent one at a time, send them to a join node; mine is set to send after 500 messages or 1 second of inactivity, but you'll find something that fits.
If the JSON objects are already grouped into an array by your processing, send them directly to the influx batch node.
In the influx batch node, under "Advanced Query Options", I set the precision to ms because that's what new Date().getTime() produces.
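As a quick illustration (the value is just the example timestamp from above converted to milliseconds since the Unix epoch):

// getTime() returns milliseconds since the Unix epoch, so the
// batch node's precision has to be "ms" for the timestamps to
// be interpreted correctly.
const ts = new Date("2020-11-19T16:02:48+0000").getTime();
// ts === 1605801768000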