Is there any difference between these two forms?
myMetric value1=1,value2=2
and this
myMetric.value1 v=1
myMetric.value2 v=2
Both store the same data (two points). Obviously, they are accessed in different ways, but is there any difference in storage, performance, etc.? As per this talk, the first one gets converted to the second one, at least semantically.
According to the InfluxDB docs, the line protocol syntax is:
<measurement>[,<tag_key>=<tag_value>[,<tag_key>=<tag_value>]] <field_key>=<field_value>[,<field_key>=<field_value>] [<timestamp>]
Your first form inserts one record into the measurement myMetric, without tags and with two fields (value1, value2) having values 1 and 2 respectively. Since no timestamp is supplied in the data, the server timestamp will be used for the data point.
In the second case you are creating two separate measurements, myMetric.value1 and myMetric.value2, each having one field named v, with values 1 and 2 respectively. Their timestamps are likely to differ too, given the default nanosecond precision.
So, these two cases are not equivalent.
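To make the structural difference concrete, here is a minimal sketch of a parser for a simplified subset of the line protocol. It is not InfluxDB's parser: `parse_line` is a hypothetical helper that assumes no escaping, no quoted string values, and plain numeric field values only.

```python
def parse_line(line):
    """Parse one line of a simplified line-protocol subset.

    Assumptions: no escaped characters, no quoted strings,
    and every field value is a plain number.
    """
    parts = line.split(" ")
    head, field_part = parts[0], parts[1]
    timestamp = int(parts[2]) if len(parts) > 2 else None

    # The head is "<measurement>[,<tag_key>=<tag_value>...]".
    measurement, *tag_pairs = head.split(",")
    tags = dict(pair.split("=", 1) for pair in tag_pairs)

    fields = {}
    for pair in field_part.split(","):
        key, value = pair.split("=", 1)
        fields[key] = float(value)

    return {
        "measurement": measurement,
        "tags": tags,
        "fields": fields,
        "timestamp": timestamp,
    }


# First form: one point, one measurement, two fields.
form1 = [parse_line("myMetric value1=1,value2=2")]

# Second form: two points in two different measurements, one field each.
form2 = [parse_line("myMetric.value1 v=1"), parse_line("myMetric.value2 v=2")]

for point in form1 + form2:
    print(point["measurement"], point["fields"])
```

The first form parses to a single point carrying both fields, while the second yields two unrelated points whose only link is the naming convention of the measurement.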
Using the influx CLI tool, these cases look like:
> INSERT myMetric value1=1,value2=2
> show measurements
name: measurements
name
----
myMetric
> show field keys from myMetric
name: myMetric
fieldKey fieldType
-------- ---------
value1   float
value2   float
> select * from myMetric
name: myMetric
time                value1 value2
----                ------ ------
1526032578114702408 1      2
For the second case:
> INSERT myMetric.value1 v=1
> INSERT myMetric.value2 v=2
> show measurements
name: measurements
name
----
myMetric.value1
myMetric.value2
> select * from "myMetric.value1"
name: myMetric.value1
time                v
----                -
1526032859752277164 1
> select * from "myMetric.value2"
name: myMetric.value2
time                v
----                -
1526032864711858673 2
As you can see, in case 1 there is one insert operation into one measurement, for one data point with two fields. In case 2 there are two insert operations into two distinct measurements, each with one field.
Thus, if value1 and value2 are usually inserted together in your use case, I would expect the first variant to be more performant: case 2 requires two inserts for the same data. Storage usage is likely to be approximately the same.
If value1 and value2 are inserted independently and at different times, case 2 can be a bit more efficient in terms of storage, as it will not have to store nulls for data points like (null,2) or (1,null).
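On the write side, the difference can be sketched by generating the lines for one sample in both layouts. This is only an illustration: `to_line` is a hypothetical helper (no escaping, numeric values only), and the explicit timestamp is an assumption I add so that the two points of the second layout at least share a time.

```python
def to_line(measurement, fields, timestamp_ns=None):
    """Format one line-protocol line (no escaping, numeric values only)."""
    field_part = ",".join(f"{key}={value}" for key, value in fields.items())
    line = f"{measurement} {field_part}"
    if timestamp_ns is not None:
        # An explicit timestamp overrides the server-assigned one.
        line += f" {timestamp_ns}"
    return line


ts = 1526032578114702408
sample = {"value1": 1, "value2": 2}

# Layout 1: one line (one point) per sample.
layout1 = [to_line("myMetric", sample, ts)]

# Layout 2: one line (one point) per field, each in its own measurement.
layout2 = [
    to_line(f"myMetric.{name}", {"v": value}, ts)
    for name, value in sample.items()
]

print(layout1)
print(layout2)
```

For every sample the second layout produces as many points as there are fields, so the number of points written grows with the number of fields rather than with the number of samples.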
Having the fields in separate measurements has another drawback. Queries like:
> select value1, value2,value2-value1 from myMetric
name: myMetric
time                value1 value2 value2_value1
----                ------ ------ -------------
1526032578114702408 1      2      1
won't be possible in the second case.
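With separate measurements, one workaround is to combine the values on the client. The sketch below assumes the rows of each measurement have already been fetched into lists of (time, value) tuples; `join_on_time` is a hypothetical helper, not part of any InfluxDB client.

```python
def join_on_time(rows_a, rows_b):
    """Inner-join two lists of (time, value) rows on exact timestamp.

    Returns (time, a, b, b - a) tuples for timestamps present in both.
    """
    b_by_time = dict(rows_b)
    return [
        (t, a, b_by_time[t], b_by_time[t] - a)
        for t, a in rows_a
        if t in b_by_time
    ]


# Timestamps from the CLI output above: server-assigned, so they differ.
value1_rows = [(1526032859752277164, 1.0)]
value2_rows = [(1526032864711858673, 2.0)]
no_match = join_on_time(value1_rows, value2_rows)

# Hypothetical rows that were written with the same explicit timestamp.
shared = 1526032578114702408
matched = join_on_time([(shared, 1.0)], [(shared, 2.0)])

print(no_match)
print(matched)
```

With server-assigned timestamps the exact-match join finds nothing, so in practice you would also have to write explicit timestamps or align points by time bucket, which is extra work that the single-measurement layout avoids.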