Tags: python, django, database, sqlite, background-task

Long-running process to regularly update a Django DB with web-harvested data?


My Django project has a user-facing site, which infrequently inserts new objects into the user DB table (later there will be hundreds, perhaps thousands of them).

Additionally, a few times per day, I need to query an external webservice for all my objects, but slowly, so as not to violate their "1 request per 10 seconds" rule. This second task is therefore a long runner; I cannot route it through a urls.py request.

Thus, besides the (uWSGI) server process, I will have a second process running in the background; it needs to work with the same database (the default sqlite3).

  • Can I run into concurrency issues when accessing the same DB? (How) can Django protect me from that? If not, how do I solve it?
  • Is importing my models.py all I need to do to access the Django DB?
  • Any other hints that might help me?

Thanks a million!
Stack Overflow rocks!

:-)


Solution

  • Django provides facilities for writing command-line applications (management commands) as well. This way you have access to all the same things you would have from the web process, including the models.

    Concurrency is handled by your database, not Django, so you don't have to worry about that. The only thing you might want to use is transactions, if you have to write multiple pieces of data at the same time and they cannot be out of sync (a sketch follows the run instructions below).

    import time

    from django.core.management.base import BaseCommand

    from myapp.models import MyModel  # "myapp" is a placeholder for your app


    class Command(BaseCommand):
        def handle(self, *args, **kwargs):
            try:
                while True:
                    self._fetch_data()
                    time.sleep(10)  # respect the "1 request per 10 seconds" rule
            except KeyboardInterrupt:
                pass

        def _fetch_data(self):
            data = ...  # fetch data from the webservice here
            MyModel.objects.create(foo=data.foo, bar=data.bar)  # insert into the db


    Assuming the above class lives in a management/commands/mycommand.py file in one of your apps, execute it using manage.py:

    python manage.py mycommand
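
    As for the transactions mentioned above: if one fetch has to write several related rows that must stay in sync, wrap the writes in a transaction so they either all commit or all roll back. A minimal sketch, assuming hypothetical models MyModel and MySummary in a placeholder app myapp:

    from django.db import transaction

    from myapp.models import MyModel, MySummary  # hypothetical models/app, adjust to your project


    def save_result(data):
        # Both inserts commit together; if the second one raises an exception,
        # the first is rolled back, so the two tables never get out of sync.
        with transaction.atomic():
            MyModel.objects.create(foo=data.foo, bar=data.bar)
            MySummary.objects.create(source=data.foo)

    You could call such a helper from _fetch_data() instead of the single create() call shown above.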