Tags: django, django-rest-framework, influxdb, wal

Store datapoints persistently before writing to database


I have a Django app that receives sensor data. The data is processed and written to InfluxDB using the influxdb-client-python library. I would like to write the data asynchronously, so that a response is returned to the producer of the data before it is actually written to the database.

However, once I send this response I can no longer afford to lose the data. Since I can never be sure that the server will in fact manage to write it to InfluxDB, I was thinking about first writing it to a file and only returning a response after that succeeds (similar to a write-ahead log, or WAL). This introduces new problems, like making sure the WAL is actually flushed to disk and, most importantly, making sure it is thread-safe, since the server may be handling multiple requests concurrently.
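What I have in mind looks roughly like this (a minimal sketch using only the standard library; the `DurableLog` name and the JSON-lines format are illustrative, not a production WAL):

```python
# Sketch: thread-safe, append-only log that is fsync'ed before the
# HTTP response would be sent. One shared instance per process.
import json
import os
import threading


class DurableLog:
    def __init__(self, path):
        self._lock = threading.Lock()
        # Append mode: concurrent request threads never overwrite entries.
        self._file = open(path, "a", encoding="utf-8")

    def append(self, record: dict) -> None:
        line = json.dumps(record)
        with self._lock:                   # serialize writers
            self._file.write(line + "\n")
            self._file.flush()             # push to the OS buffers...
            os.fsync(self._file.fileno())  # ...and force them onto disk

# Only after append() returns would the view send its 2xx response;
# a separate consumer would then drain the file into InfluxDB.
```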

Is there any better way of doing this, maybe built-in in Django?


Solution

  • That sounds like a job for a queue, background tasks, etc. You are right that it only displaces the issue, but queue backends are designed to be highly reliable as well.

    The standard libs for doing this with Django and/or Rest Framework are:

    • Celery - full-featured, mature, pluggable queue backends
    • Django RQ - simpler, uses only Redis

    Celery is probably the right starting point here, since it lets you use a "real" queue backend, but Django RQ + Redis will also work if the load isn't heavy.
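    As a rough illustration, the InfluxDB write as a retried Celery task might look like this (a sketch, not a drop-in solution: the broker URL, token, bucket/org names, and retry policy are placeholders):

    ```python
    # Sketch: durable InfluxDB write via Celery. acks_late keeps the message
    # on the broker until the task finishes, so a worker crash re-delivers it.
    from celery import Celery
    from influxdb_client import InfluxDBClient, Point
    from influxdb_client.client.write_api import SYNCHRONOUS

    app = Celery("sensors", broker="redis://localhost:6379/0")  # placeholder broker

    @app.task(bind=True, acks_late=True, max_retries=5, default_retry_delay=10)
    def write_datapoint(self, measurement, tags, fields):
        point = Point(measurement)
        for key, value in tags.items():
            point = point.tag(key, value)
        for key, value in fields.items():
            point = point.field(key, value)
        try:
            # url/token/bucket/org below are placeholders for your own config
            with InfluxDBClient(url="http://localhost:8086", token="...") as client:
                client.write_api(write_options=SYNCHRONOUS).write(
                    bucket="sensors", org="my-org", record=point
                )
        except Exception as exc:
            raise self.retry(exc=exc)  # re-queue on transient failures
    ```

    The Django view then calls `write_datapoint.delay(...)` and returns its response immediately; durability is the broker's problem rather than a hand-rolled file.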

    Without knowing anything more about your app or architecture, it's hard to say more. There are a lot of queuing systems (RabbitMQ, ZeroMQ, AWS SQS, Google Cloud Pub/Sub, etc.). You could also look into building the queue + processor with, for example, AWS SQS and AWS Lambda functions (or their Google Cloud equivalents).