Tags: python, python-asyncio, influxdb, aioinflux

Why does calling `write` in aioinflux with an iterable of a user-defined dataclass seem to only insert the last item?


I've been trying to insert the data contained in a list into a running InfluxDB server. The list contains items of the following type, CoordInfluxData:

from aioinflux.serialization.usertype import FLOAT, INT, TIMEINT, lineprotocol
from dataclasses import dataclass

@lineprotocol(
    schema=dict(lon=INT, lat=INT, time=TIMEINT, humidity=FLOAT, wind_speed=FLOAT)
)
@dataclass
class CoordInfluxData(dict):
    lon: int
    lat: int
    time: int
    humidity: float
    wind_speed: float

I'm using InfluxDB 1.8, and there are two things I can't understand. First, inserting an iterable of this user-defined dataclass seems to insert only the last item of the iterable into the db. Second, even if I explicitly call write and provide a measurement argument, that measurement does not get created in the db. The only measurement that gets created has the same name as the custom dataclass I attempt to write.

Here's a sample script:

import asyncio
from dataclasses import dataclass

from aioinflux import InfluxDBClient
from aioinflux.serialization.usertype import FLOAT, INT, TIMEINT, lineprotocol


@lineprotocol(
    schema=dict(lon=INT, lat=INT, time=TIMEINT, humidity=FLOAT, wind_speed=FLOAT)
)
@dataclass
class CoordInfluxData(dict):
    lon: int
    lat: int
    time: int
    humidity: float
    wind_speed: float


async def main():
    db_name = "coord_data"
    data = [
        CoordInfluxData(
            lon=164, lat=-15, time=1649938757, humidity=75, wind_speed=5.36
        ),
        CoordInfluxData(lon=33, lat=-18, time=1649938757, humidity=73, wind_speed=0.99),
        CoordInfluxData(
            lon=139, lat=18, time=1649938757, humidity=86, wind_speed=15.13
        ),
    ]

    client = InfluxDBClient(db=db_name)
    await client.create_database(db_name)

    await client.write(data, "weather_data")

    await client.close()


# Call main
asyncio.run(main())



Solution

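  • Looking at the provided data, I believe the problem is that InfluxDB identifies the provided points as duplicates. The schema declares every attribute as a field and none as a tag, and all three points carry the same timestamp (1649938757), so they share the same measurement name, the same (empty) tag set, and the same timestamp. As the docs say: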

    A point is uniquely identified by the measurement name, tag set, and timestamp. If you submit a new point with the same measurement, tag set, and timestamp as an existing point, the field set becomes the union of the old field set and the new field set, where any ties go to the new field set.
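
To make the quoted rule concrete, here is a minimal in-memory sketch of it in plain Python. It does not talk to InfluxDB or use aioinflux; `write_point` and the dictionary stores are hypothetical stand-ins that only model the "measurement name + tag set + timestamp" identity and the "ties go to the new field set" merge. The second loop assumes, purely for illustration, that lon and lat are treated as tags.

```python
from typing import Any, Dict, FrozenSet, Tuple

# A point's identity: (measurement name, tag set, timestamp).
PointKey = Tuple[str, FrozenSet[Tuple[str, str]], int]


def write_point(
    store: Dict[PointKey, Dict[str, Any]],
    measurement: str,
    tags: Dict[str, str],
    fields: Dict[str, Any],
    timestamp: int,
) -> None:
    """Hypothetical model of a write: same key means the field sets are merged."""
    key = (measurement, frozenset(tags.items()), timestamp)
    merged = dict(store.get(key, {}))
    merged.update(fields)  # on a tie, the newer field value wins
    store[key] = merged


# The three rows from the question; they all share one timestamp.
rows = [
    dict(lon=164, lat=-15, humidity=75.0, wind_speed=5.36),
    dict(lon=33, lat=-18, humidity=73.0, wind_speed=0.99),
    dict(lon=139, lat=18, humidity=86.0, wind_speed=15.13),
]
timestamp = 1649938757

# Case 1: everything is a field and the tag set is empty, as in the question.
untagged: Dict[PointKey, Dict[str, Any]] = {}
for row in rows:
    write_point(untagged, "CoordInfluxData", {}, row, timestamp)

# Case 2 (assumption for illustration): lon and lat form the tag set.
tagged: Dict[PointKey, Dict[str, Any]] = {}
for row in rows:
    tags = {"lon": str(row["lon"]), "lat": str(row["lat"])}
    fields = {"humidity": row["humidity"], "wind_speed": row["wind_speed"]}
    write_point(tagged, "CoordInfluxData", tags, fields, timestamp)

print(len(untagged))  # number of distinct points when no tags are used
print(len(tagged))    # number of distinct points when lon/lat are tags
```

In the first case every write lands on the same key, so each new row overwrites the fields of the previous one and only the values of the last row remain, which matches the behaviour described in the question. In the second case the tag sets differ, so the three points stay separate. If lon and lat are indeed meant to identify a location, one option worth trying is to declare them as tags in the schema (aioinflux's usertype module provides a `TAG` type alongside `INT` and `FLOAT`); giving each point a distinct timestamp would be another.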